[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs

2015-02-03 Thread Joseph Salisbury
** Tags removed: kernel-key
** Tags added: kernel-da-key

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1415510

Title:
  Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1415510/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs


[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs

2015-01-29 Thread Lenz Grimmer
Hi Joseph, thanks for your quick reply. The problems began some time
after I started using LXC with Btrfs on Trusty, but I'm not sure if a
particular kernel update introduced a regression. I've been using Btrfs
on my home directory previously as well, but did not experience similar
I/O-related crashes.

I'll try using the latest upstream kernel and report back. The problem
is that these crashes happen randomly; I have not yet been able to
reproduce them reliably (and since each one brings down my entire
work environment, I'm not too excited about these crashes, as you can
probably imagine).



[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs

2015-01-29 Thread Lenz Grimmer
FYI, I'm now running kernel version 3.19.0-031900rc6-generic from the
URL you mentioned in #4. Let's see if the issue persists.

The file system mounts fine, but I'm concerned about the output of
btrfsck:

Checking filesystem on /dev/mapper/ubuntu--vg-container
UUID: b95b58fb-d0b2-4735-a4ed-2033537eb89a
checking extents
checking free space cache
free space inode generation (0) did not match free space cache generation (135494)
free space inode generation (0) did not match free space cache generation (135494)
There is no free space entry for 11902451712-11904450560
There is no free space entry for 11902451712-12914262016
cache appears valid but isnt 11840520192
There is no free space entry for 14083878912-14084898816
There is no free space entry for 14083878912-15061745664
cache appears valid but isnt 13988003840
There is no free space entry for 15305744384-15306821632
There is no free space entry for 15305744384-16135487488
cache appears valid but isnt 15061745664
There is no free space entry for 16377102336-16378097664
There is no free space entry for 16377102336-17209229312
cache appears valid but isnt 16135487488
Wanted bytes 1957888, found 393216 for off 17210597376
Wanted bytes 1072373760, found 393216 for off 17210597376
cache appears valid but isnt 17209229312
There is no free space entry for 22839361536-22840332288
There is no free space entry for 22839361536-23651680256
cache appears valid but isnt 22577938432
There is no free space entry for 27126554624-27128291328
There is no free space entry for 27126554624-27946647552
cache appears valid but isnt 26872905728
There is no free space entry for 27985326080-27987566592
There is no free space entry for 27985326080-29020389376
cache appears valid but isnt 27946647552
There is no free space entry for 30126866432-30127960064
There is no free space entry for 30126866432-31167873024
cache appears valid but isnt 30094131200
Wanted bytes 2195456, found 12288 for off 64471998464
Wanted bytes 1055612928, found 12288 for off 64471998464
cache appears valid but isnt 64453869568
There is no free space entry for 65631424512-65632366592
There is no free space entry for 65631424512-66601353216
cache appears valid but isnt 65527611392
free space inode generation (0) did not match free space cache generation (135494)
found 15376998613 bytes used err is -22
total csum bytes: 44021104
total tree bytes: 601505792
total fs tree bytes: 419954688
total extent tree bytes: 10888
btree space waste bytes: 129058353
file data blocks allocated: 126222901248
 referenced 43301400576
Btrfs v3.12

I'm not sure if the corruption is the cause or the consequence of the
kernel panics, which require a hard reboot.
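
To compare a check run like the one above against a later one, the repeated
messages can be tallied by type. The following is only a rough sketch: the
regular expressions are derived from the lines quoted in this report, and the
category names and the function name summarize_btrfsck are made up for
illustration, not btrfs-progs terminology.

```python
import re
from collections import Counter

# Message shapes copied from the btrfsck output quoted in this report.
# The category names on the left are illustrative labels only.
PATTERNS = {
    "missing_free_space_entry": re.compile(
        r"^There is no free space entry for (\d+)-(\d+)$"
    ),
    "invalid_cache": re.compile(r"^cache appears valid but isnt (\d+)$"),
    "size_mismatch": re.compile(
        r"^Wanted bytes (\d+), found (\d+) for off (\d+)$"
    ),
}


def summarize_btrfsck(text):
    """Count how often each known message shape occurs in btrfsck output."""
    counts = Counter()
    for line in text.splitlines():
        line = line.strip()
        for name, pattern in PATTERNS.items():
            if pattern.match(line):
                counts[name] += 1
                break  # a line can match at most one category
    return counts


sample = """\
There is no free space entry for 11902451712-11904450560
There is no free space entry for 11902451712-12914262016
cache appears valid but isnt 11840520192
Wanted bytes 1957888, found 393216 for off 17210597376
"""

if __name__ == "__main__":
    for name, count in sorted(summarize_btrfsck(sample).items()):
        print(f"{name}: {count}")
```

Lines that match none of the patterns (the summary totals, for example) are
simply ignored, so the tally covers only the three message types listed.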



[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs

2015-01-29 Thread Lenz Grimmer
Looks like the repair was successful:

% sudo btrfsck --repair /dev/mapper/ubuntu--vg-container
enabling repair mode
Checking filesystem on /dev/mapper/ubuntu--vg-container
UUID: b95b58fb-d0b2-4735-a4ed-2033537eb89a
checking extents
checking free space cache
cache and super generation don't match, space cache will be invalidated
checking fs roots
checking csums
checking root refs
found 37972963086 bytes used err is 0
total csum bytes: 44021104
total tree bytes: 651902976
total fs tree bytes: 470499328
total extent tree bytes: 103186432
btree space waste bytes: 136847949
file data blocks allocated: 126929125376
 referenced 44007616512
Btrfs v3.12


I'll now proceed with testing this against the upstream kernel.



[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs

2015-01-29 Thread Lenz Grimmer
NB: Trying to dd an image of the Btrfs file system to an external USB3
disk using the mainline kernel revealed a different issue, which I
reported as bug #1415859.



[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs

2015-01-29 Thread Lenz Grimmer
Update: using the mainline kernel, I observe a slightly different
pattern. When running multiple heavy I/O operations in parallel (e.g.
rsyncing a large ISO image to a container, performing an HTTP upload
into another one and running yum update on all containers), the large
uploads start to stall, slow to a crawl and eventually come to a halt.

dmesg shows Btrfs-related hung-task warnings instead of a panic:

[ 6838.005920] INFO: task kworker/u16:0:5815 blocked for more than 120 seconds.
[ 6838.005924]   Not tainted 3.19.0-031900rc6-generic #201501261152
[ 6838.005925] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6838.005926] kworker/u16:0   D 88024422bb18 0  5815  2 0x
[ 6838.005953] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
[ 6838.005954]  88024422bb18 88024422bad8 88024422bfd8 000141c0
[ 6838.005956]  88030c1b0700 88021a1e13a0 8802c78a75c0 88024422bb08
[ 6838.005958]  88024422bc88 7fff 7fff 8802c78a75c0
[ 6838.005959] Call Trace:
[ 6838.005965]  [817cd6b9] schedule+0x29/0x70
[ 6838.005968]  [817d0445] schedule_timeout+0x1b5/0x210
[ 6838.005972]  [8108e01a] ? __queue_delayed_work+0xaa/0x1a0
[ 6838.005974]  [8108e5db] ? try_to_grab_pending+0x4b/0x80
[ 6838.005976]  [817cebc7] wait_for_completion+0xa7/0x160
[ 6838.005979]  [810a3fa0] ? try_to_wake_up+0x2a0/0x2a0
[ 6838.005983]  [8121d6c6] writeback_inodes_sb_nr+0x86/0xb0
[ 6838.005997]  [c0630b9d] shrink_delalloc+0x10d/0x300 [btrfs]
[ 6838.006011]  [c0630e68] flush_space+0xd8/0x150 [btrfs]
[ 6838.006022]  [c063175b] btrfs_async_reclaim_metadata_space+0x14b/0x1d0 [btrfs]
[ 6838.006024]  [8108f6dd] process_one_work+0x14d/0x460
[ 6838.006026]  [810900bb] worker_thread+0x11b/0x3f0
[ 6838.006029]  [8108ffa0] ? create_worker+0x1e0/0x1e0
[ 6838.006031]  [81095cc9] kthread+0xc9/0xe0
[ 6838.006032]  [81095c00] ? flush_kthread_worker+0x90/0x90
[ 6838.006035]  [817d17fc] ret_from_fork+0x7c/0xb0
[ 6838.006037]  [81095c00] ? flush_kthread_worker+0x90/0x90
[ 6957.962660] INFO: task kworker/u16:0:5815 blocked for more than 120 seconds.
[ 6957.962667]   Not tainted 3.19.0-031900rc6-generic #201501261152
[ 6957.962668] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 6957.962671] kworker/u16:0   D 88024422bb18 0  5815  2 0x
[ 6957.962706] Workqueue: events_unbound btrfs_async_reclaim_metadata_space [btrfs]
[ 6957.962709]  88024422bb18 88024422bad8 88024422bfd8 000141c0
[ 6957.962713]  88030c1b0700 88021a1e13a0 8802c78a75c0 88024422bb08
[ 6957.962716]  88024422bc88 7fff 7fff 8802c78a75c0
[ 6957.962720] Call Trace:
[ 6957.962741]  [817cd6b9] schedule+0x29/0x70
[ 6957.962746]  [817d0445] schedule_timeout+0x1b5/0x210
[ 6957.962752]  [8108e01a] ? __queue_delayed_work+0xaa/0x1a0
[ 6957.962756]  [8108e5db] ? try_to_grab_pending+0x4b/0x80
[ 6957.962760]  [817cebc7] wait_for_completion+0xa7/0x160
[ 6957.962765]  [810a3fa0] ? try_to_wake_up+0x2a0/0x2a0
[ 6957.962771]  [8121d6c6] writeback_inodes_sb_nr+0x86/0xb0
[ 6957.962787]  [c0630b9d] shrink_delalloc+0x10d/0x300 [btrfs]
[ 6957.962803]  [c0630e68] flush_space+0xd8/0x150 [btrfs]
[ 6957.962817]  [c063175b] btrfs_async_reclaim_metadata_space+0x14b/0x1d0 [btrfs]
[ 6957.962822]  [8108f6dd] process_one_work+0x14d/0x460
[ 6957.962826]  [810900bb] worker_thread+0x11b/0x3f0
[ 6957.962830]  [8108ffa0] ? create_worker+0x1e0/0x1e0
[ 6957.962834]  [81095cc9] kthread+0xc9/0xe0
[ 6957.962838]  [81095c00] ? flush_kthread_worker+0x90/0x90
[ 6957.962842]  [817d17fc] ret_from_fork+0x7c/0xb0
[ 6957.962846]  [81095c00] ? flush_kthread_worker+0x90/0x90
[ 6962.761961] systemd-hostnamed[15586]: Warning: nss-myhostname is not installed. Changing the local hostname might make it unresolveable. Please install nss-myhostname!
[ 7437.789596] INFO: task yum:14547 blocked for more than 120 seconds.
[ 7437.789600]   Not tainted 3.19.0-031900rc6-generic #201501261152
[ 7437.789601] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 7437.789602] yum D 880286777868 0 14547  14546 0x
[ 7437.789605]  880286777868 00020001 880286777fd8 000141c0
[ 7437.789607]  88002e07db00 81c1c500 8801f8892740 880286777858
[ 7437.789608]  8802867779d8 7fff 7fff 8801f8892740
[ 7437.789610] Call Trace:
[ 7437.789616]  [817cd6b9] schedule+0x29/0x70
[ 7437.789619]  [817d0445] schedule_timeout+0x1b5/0x210
[ 7437.789623]  [8108e01a] ? __queue_delayed_work+0xaa/0x1a0
[ 7437.789625]  [8108e5db] ? 
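
When a log like the one above grows long, it can help to list only the
hung-task headers to see which tasks stalled and when. Below is a minimal
sketch, assuming the "INFO: task ... blocked for more than N seconds." line
format shown in this report; the function name hung_tasks and the tuple
layout are hypothetical choices for illustration.

```python
import re

# Matches the hung-task header lines quoted in this report, e.g.
# "[ 6838.005920] INFO: task kworker/u16:0:5815 blocked for more than 120 seconds."
# The task name may itself contain colons, so the PID is taken as the last
# colon-separated number before " blocked".
HUNG_TASK = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds\.$"
)


def hung_tasks(dmesg_text):
    """Return (timestamp, task name, pid, threshold seconds) per report."""
    reports = []
    for line in dmesg_text.splitlines():
        match = HUNG_TASK.match(line.strip())
        if match:
            reports.append(
                (
                    float(match["ts"]),
                    match["comm"],
                    int(match["pid"]),
                    int(match["secs"]),
                )
            )
    return reports


sample = """\
[ 6838.005920] INFO: task kworker/u16:0:5815 blocked for more than 120 seconds.
[ 6838.005924]   Not tainted 3.19.0-031900rc6-generic #201501261152
[ 7437.789596] INFO: task yum:14547 blocked for more than 120 seconds.
"""

if __name__ == "__main__":
    for ts, comm, pid, secs in hung_tasks(sample):
        print(f"{ts:.6f} {comm} (pid {pid}) blocked > {secs}s")
```

Feeding it saved dmesg output yields one entry per hung-task report, which
makes repeated stalls of the same worker (here kworker/u16:0, PID 5815) easy
to spot.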

[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs

2015-01-28 Thread Lenz Grimmer
Attaching a second screenshot with a slightly different stack trace.

** Attachment added: IMG_20150126_095337.jpg
   https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1415510/+attachment/4307319/+files/IMG_20150126_095337.jpg



[Bug 1415510] Re: Frequent kernel panics when doing heavy I/O in LXC containers on Btrfs

2015-01-28 Thread Joseph Salisbury
Did this issue start happening after an update/upgrade?  Was there a
prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer
to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest
v3.19 kernel[0].

If this bug is fixed in the mainline kernel, please add the following
tag: 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag:
'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example because it
will not boot, please add the tag: 'kernel-unable-to-test-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as
Confirmed.


Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.19-rc6-vivid/

** Changed in: linux (Ubuntu)
   Importance: Undecided => High

** Tags added: kernel-key
