[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs
** Tags removed: kernel-key
** Tags added: kernel-da-key

-- 
You received this bug notification because you are a member of Ubuntu Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1415510

Title:
  Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1415510/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs
[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs
Hi Joseph, thanks for your quick reply.

The problems began some time after I started using LXC with Btrfs on Trusty, but I'm not sure whether a particular kernel update introduced a regression. I had previously been using Btrfs for my home directory as well, but did not experience similar I/O-related crashes.

I'll try the latest upstream kernel and report back. The difficulty is that these crashes happen randomly, and I have not yet been able to reproduce them reliably. Since each crash brings down my entire work environment, I'm not too excited about triggering them, as you can probably imagine.
[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs
FYI, I'm now running Kernel version 3.19.0-031900rc6-generic from the URL you mentioned in #4. Let's see if the issue persists.

The file system mounts fine, but I'm concerned about the output of btrfsck:

Checking filesystem on /dev/mapper/ubuntu--vg-container
UUID: b95b58fb-d0b2-4735-a4ed-2033537eb89a
checking extents
checking free space cache
free space inode generation (0) did not match free space cache generation (135494)
free space inode generation (0) did not match free space cache generation (135494)
There is no free space entry for 11902451712-11904450560
There is no free space entry for 11902451712-12914262016
cache appears valid but isnt 11840520192
There is no free space entry for 14083878912-14084898816
There is no free space entry for 14083878912-15061745664
cache appears valid but isnt 13988003840
There is no free space entry for 15305744384-15306821632
There is no free space entry for 15305744384-16135487488
cache appears valid but isnt 15061745664
There is no free space entry for 16377102336-16378097664
There is no free space entry for 16377102336-17209229312
cache appears valid but isnt 16135487488
Wanted bytes 1957888, found 393216 for off 17210597376
Wanted bytes 1072373760, found 393216 for off 17210597376
cache appears valid but isnt 17209229312
There is no free space entry for 22839361536-22840332288
There is no free space entry for 22839361536-23651680256
cache appears valid but isnt 22577938432
There is no free space entry for 27126554624-27128291328
There is no free space entry for 27126554624-27946647552
cache appears valid but isnt 26872905728
There is no free space entry for 27985326080-27987566592
There is no free space entry for 27985326080-29020389376
cache appears valid but isnt 27946647552
There is no free space entry for 30126866432-30127960064
There is no free space entry for 30126866432-31167873024
cache appears valid but isnt 30094131200
Wanted bytes 2195456, found 12288 for off 64471998464
Wanted bytes 1055612928, found 12288 for off 64471998464
cache appears valid but isnt 64453869568
There is no free space entry for 65631424512-65632366592
There is no free space entry for 65631424512-66601353216
cache appears valid but isnt 65527611392
free space inode generation (0) did not match free space cache generation (135494)
found 15376998613 bytes used err is -22
total csum bytes: 44021104
total tree bytes: 601505792
total fs tree bytes: 419954688
total extent tree bytes: 10888
btree space waste bytes: 129058353
file data blocks allocated: 126222901248
 referenced 43301400576
Btrfs v3.12

I'm not sure if the corruption is the cause or the consequence of the kernel panics, which require a hard reboot.
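For anyone comparing btrfsck runs before and after a crash, a small script can condense the free space cache complaints into a few numbers. The following is only a sketch: the function name `summarize_btrfsck` is hypothetical, and the regular expressions assume the message wording exactly as it appears in the output quoted in this report (other btrfs-progs versions may word things differently).

```python
import re


def summarize_btrfsck(output: str) -> dict:
    """Summarize free space cache complaints in btrfsck text output.

    Assumes the message formats quoted in this bug report.
    """
    # Block groups whose cache was rejected ("isnt" is the tool's spelling).
    invalid = re.findall(r"cache appears valid but isnt (\d+)", output)
    # Byte ranges reported as missing from the free space cache.
    missing = re.findall(r"There is no free space entry for (\d+)-(\d+)", output)
    # Final status code, e.g. "err is -22".
    err = re.search(r"err is (-?\d+)", output)
    return {
        "invalid_block_groups": [int(x) for x in invalid],
        "missing_ranges": [(int(a), int(b)) for a, b in missing],
        "err": int(err.group(1)) if err else None,
    }


# A few lines copied from the output above, used as sample input.
SAMPLE = """\
There is no free space entry for 11902451712-11904450560
There is no free space entry for 11902451712-12914262016
cache appears valid but isnt 11840520192
found 15376998613 bytes used err is -22
"""

if __name__ == "__main__":
    print(summarize_btrfsck(SAMPLE))
```

Saving the full btrfsck output to a file and passing its contents to the function gives one list entry per rejected block group, which makes it easier to see whether a later run reports fewer or more of them.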
[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs
Looks like the repair was successful:

% sudo btrfsck --repair /dev/mapper/ubuntu--vg-container
enabling repair mode
Checking filesystem on /dev/mapper/ubuntu--vg-container
UUID: b95b58fb-d0b2-4735-a4ed-2033537eb89a
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
checking csums
checking root refs
found 37972963086 bytes used err is 0
total csum bytes: 44021104
total tree bytes: 651902976
total fs tree bytes: 470499328
total extent tree bytes: 103186432
btree space waste bytes: 136847949
file data blocks allocated: 126929125376
 referenced 44007616512
Btrfs v3.12

I'll now proceed with testing this against the upstream kernel.
[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs
NB: Trying to dd an image of the Btrfs file system to an external USB3 disk using the mainline kernel revealed a different issue, which I reported as bug#1415859.
[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs
Update: using the mainline kernel, I observe a slightly different pattern. When running multiple heavy I/O operations in parallel (e.g. rsyncing a large ISO image to a container, performing an http upload into another one and running yum update on all containers), the large uploads start to stall and come to a crawling halt at some point. dmesg reveals some different btrfs related issues:

[ 6838.005920] INFO: task kworker/u16:0:5815 blocked for more than 120 seconds.
[ 6838.005924] Not tainted 3.19.0-031900rc6-generic #201501261152
[ 6838.005925] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6838.005926] kworker/u16:0 D 88024422bb18 0 5815 2 0x
[ 6838.005953] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
[ 6838.005954] 88024422bb18 88024422bad8 88024422bfd8 000141c0
[ 6838.005956] 88030c1b0700 88021a1e13a0 8802c78a75c0 88024422bb08
[ 6838.005958] 88024422bc88 7fff 7fff 8802c78a75c0
[ 6838.005959] Call Trace:
[ 6838.005965] [817cd6b9] schedule+0x29/0x70
[ 6838.005968] [817d0445] schedule_timeout+0x1b5/0x210
[ 6838.005972] [8108e01a] ? __queue_delayed_work+0xaa/0x1a0
[ 6838.005974] [8108e5db] ? try_to_grab_pending+0x4b/0x80
[ 6838.005976] [817cebc7] wait_for_completion+0xa7/0x160
[ 6838.005979] [810a3fa0] ? try_to_wake_up+0x2a0/0x2a0
[ 6838.005983] [8121d6c6] writeback_inodes_sb_nr+0x86/0xb0
[ 6838.005997] [c0630b9d] shrink_delalloc+0x10d/0x300 [btrfs]
[ 6838.006011] [c0630e68] flush_space+0xd8/0x150 [btrfs]
[ 6838.006022] [c063175b] btrfs_async_reclaim_metadata_space+0x14b/0x1d0 [btrfs]
[ 6838.006024] [8108f6dd] process_one_work+0x14d/0x460
[ 6838.006026] [810900bb] worker_thread+0x11b/0x3f0
[ 6838.006029] [8108ffa0] ? create_worker+0x1e0/0x1e0
[ 6838.006031] [81095cc9] kthread+0xc9/0xe0
[ 6838.006032] [81095c00] ? flush_kthread_worker+0x90/0x90
[ 6838.006035] [817d17fc] ret_from_fork+0x7c/0xb0
[ 6838.006037] [81095c00] ? flush_kthread_worker+0x90/0x90
[ 6957.962660] INFO: task kworker/u16:0:5815 blocked for more than 120 seconds.
[ 6957.962667] Not tainted 3.19.0-031900rc6-generic #201501261152
[ 6957.962668] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6957.962671] kworker/u16:0 D 88024422bb18 0 5815 2 0x
[ 6957.962706] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
[ 6957.962709] 88024422bb18 88024422bad8 88024422bfd8 000141c0
[ 6957.962713] 88030c1b0700 88021a1e13a0 8802c78a75c0 88024422bb08
[ 6957.962716] 88024422bc88 7fff 7fff 8802c78a75c0
[ 6957.962720] Call Trace:
[ 6957.962741] [817cd6b9] schedule+0x29/0x70
[ 6957.962746] [817d0445] schedule_timeout+0x1b5/0x210
[ 6957.962752] [8108e01a] ? __queue_delayed_work+0xaa/0x1a0
[ 6957.962756] [8108e5db] ? try_to_grab_pending+0x4b/0x80
[ 6957.962760] [817cebc7] wait_for_completion+0xa7/0x160
[ 6957.962765] [810a3fa0] ? try_to_wake_up+0x2a0/0x2a0
[ 6957.962771] [8121d6c6] writeback_inodes_sb_nr+0x86/0xb0
[ 6957.962787] [c0630b9d] shrink_delalloc+0x10d/0x300 [btrfs]
[ 6957.962803] [c0630e68] flush_space+0xd8/0x150 [btrfs]
[ 6957.962817] [c063175b] btrfs_async_reclaim_metadata_space+0x14b/0x1d0 [btrfs]
[ 6957.962822] [8108f6dd] process_one_work+0x14d/0x460
[ 6957.962826] [810900bb] worker_thread+0x11b/0x3f0
[ 6957.962830] [8108ffa0] ? create_worker+0x1e0/0x1e0
[ 6957.962834] [81095cc9] kthread+0xc9/0xe0
[ 6957.962838] [81095c00] ? flush_kthread_worker+0x90/0x90
[ 6957.962842] [817d17fc] ret_from_fork+0x7c/0xb0
[ 6957.962846] [81095c00] ? flush_kthread_worker+0x90/0x90
[ 6962.761961] systemd-hostnamed[15586]: Warning: nss-myhostname is not installed. Changing the local hostname might make it unresolveable. Please install nss-myhostname!
[ 7437.789596] INFO: task yum:14547 blocked for more than 120 seconds.
[ 7437.789600] Not tainted 3.19.0-031900rc6-generic #201501261152
[ 7437.789601] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 7437.789602] yum D 880286777868 0 14547 14546 0x
[ 7437.789605] 880286777868 00020001 880286777fd8 000141c0
[ 7437.789607] 88002e07db00 81c1c500 8801f8892740 880286777858
[ 7437.789608] 8802867779d8 7fff 7fff 8801f8892740
[ 7437.789610] Call Trace:
[ 7437.789616] [817cd6b9] schedule+0x29/0x70
[ 7437.789619] [817d0445] schedule_timeout+0x1b5/0x210
[ 7437.789623] [8108e01a] ? __queue_delayed_work+0xaa/0x1a0
[ 7437.789625] [8108e5db] ?
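Since the same worker shows up repeatedly in the log above, it may help to tally which tasks the hung-task detector reports and how often. The sketch below is an illustration only: `count_hung_tasks` is a hypothetical helper name, and the pattern assumes the "blocked for more than N seconds" wording exactly as it appears in the dmesg excerpt in this report.

```python
import re
from collections import Counter

# Matches lines such as:
#   INFO: task kworker/u16:0:5815 blocked for more than 120 seconds.
# The task name may itself contain colons, so the PID is taken as the
# last colon-separated number before " blocked".
HUNG_TASK = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")


def count_hung_tasks(log: str) -> Counter:
    """Count hung-task reports per (task name, pid) in dmesg text."""
    counts = Counter()
    for line in log.splitlines():
        match = HUNG_TASK.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


# Lines copied from the dmesg excerpt above, used as sample input.
SAMPLE = """\
[ 6838.005920] INFO: task kworker/u16:0:5815 blocked for more than 120 seconds.
[ 6957.962660] INFO: task kworker/u16:0:5815 blocked for more than 120 seconds.
[ 7437.789596] INFO: task yum:14547 blocked for more than 120 seconds.
"""

if __name__ == "__main__":
    for (task, pid), n in count_hung_tasks(SAMPLE).items():
        print(task, pid, n)
```

Feeding it the saved output of dmesg gives a quick view of whether the stalls are dominated by the btrfs worker thread or by the user processes waiting on it.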
[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs
Attaching a second screenshot with a slightly different stack trace.

** Attachment added: IMG_20150126_095337.jpg
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1415510/+attachment/4307319/+files/IMG_20150126_095337.jpg
[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs
Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.19 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as Confirmed.

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.19-rc6-vivid/

** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Tags added: kernel-key