This is most probably related to the same regression seen after 2.6.38. My blocked comment from 3 August included an indication that the behaviour was present in my distro's 2.6.38 kernel too; it just appeared after a considerably longer uptime (on my desktop system, using btrfs as rootfs on an Intel ICH10-driven SATA HDD).
I have since reverted my / to ext4 and I'm okay with it, although I would be very happy to see some improvement on this issue, which is serious for me.

Btrfs slowdown
news://news.gmane.org:119/cao47_-9blkwugdeuzalqhsq9tzkauao8fmqey1ppk9a2hb+...@mail.gmail.com

Also, a patch by Josef Bacik was an attempt at fixing this, but nobody reported testing it on an affected system; it did not eliminate the slowdowns for me:

PLEASE TEST: Everybody who is seeing weird and long hangs
news://news.gmane.org:119/4e36c47e.70...@redhat.com

My comment was meant as an answer to Mck's post in the "Btrfs slowdown" thread, where I reported on this in a little more detail - but it never appeared on the list. I'll try including it now:
________________________________________________________________________

I'm confirming this too. Following advice given on the #btrfs IRC channel, I applied Josef's second patch for fs/btrfs/extent_io.c, and I'm reporting that it did NOT make the slowdowns disappear on 3.0 kernels (even with some rather different configs). The HDD thrashing appeared on every kernel version I tried that is newer than 2.6.37.

Initially I set out to find the latest known-good kernel (to prepare a proper git bisect, as cwillu advised; a minimal bisect invocation is sketched below), and at first I also felt that 2.6.38 did not show this miserable behaviour. But it later turned out that this held only for approximately two days of uptime. Given enough time, the lock-ups appeared on 2.6.38 too, although they were not as apparent as on later kernel versions, and the individual lockups took much less time with 2.6.38 running for two days (binary Sabayon Linux repository kernel).

My HDD, with btrfs as / on it, emits quite distinct (and loud enough) noises with a slightly different character for reads and writes - and I can actually hear the disk's repetitive seek pattern during such a thrashing period. Based on that, I guess it must be the exact same thing happening with 2.6.38 as with later kernels, because they sound very similar. The lockups last much shorter there, but they have the same repetitive seeking nature, with all other I/O severely throttled, and I believe it is mostly writes that are happening during a lockup.

So I have concluded that I failed to identify a known-good version so far. I didn't have time to get into kernels earlier than .38 (I tried .37, but for too brief an uptime to claim the lockups did not appear while I was on .37).

It is similar with my current kernel. The thrashing started after about 12 hours of running the machine using:

# uname -a
Linux insula 3.0.0-git15genseed #2 SMP PREEMPT Tue Aug 2 20:10:05 CEST 2011 x86_64 Intel(R) Core(TM)2 Duo CPU E4500 @ 2.20GHz GenuineIntel GNU/Linux

As the appended version string reflects, it is a custom kernel; it has Josef's patch applied, with the config attached. I also tried patching my distro's 3.0 kernel: no change was experienced with regard to the issue (IIRC it was even a lot worse).

Let me know if I can contribute anything that would be valuable to the developers towards the elimination of this very nasty bug. Now, after 23 hours of uptime, my PC has become almost unusable. Currently it is about 8 seconds of thrashing, then 10 seconds without, and during thrashing all other (disk) I/O is practically blocked.
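For completeness, the bisect cwillu suggested would look something like the sketch below. The exact tags are my assumption (a clone of Linus' tree, with v2.6.37 taken as good even though my uptime on .37 was too short to really verify that):

$ git bisect start
$ git bisect bad v3.0        # a version where I clearly see the thrashing
$ git bisect good v2.6.37    # assumed good - not verified for long enough
(then build, boot and run each revision git suggests until the lockups do
or do not appear, and mark it with "git bisect good" or "git bisect bad")

The catch, given the above, is that a single bisect step may need up to two days of uptime before the bug can be ruled out - which is exactly why I was trying to pin down a known-good version first.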
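(A note on how the dump below was taken: with magic SysRq available - i.e. a kernel built with CONFIG_MAGIC_SYSRQ - the same blocked-state listing can also be requested from a root shell instead of the keyboard combo, e.g.:

# echo w > /proc/sysrq-trigger    # ask the kernel to list blocked tasks
# dmesg | tail                    # the dump lands in the kernel log

This is standard kernel behaviour, nothing btrfs-specific.)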
SysRq+W under thrashing (dunno how informative it is, but here's one):

[62279.779382] SysRq : Show Blocked State
[62279.779389]   task                        PC stack   pid father
[62279.779404] btrfs-submit-0  D 0000000000000000  5616  4678      2 0x00000000
[62279.779413]  ffff88012b1370d0 0000000000000046 ffff880100000000 ffffffff8182c020
[62279.779422]  ffff880128d39fd8 0000000000010480 0000000000004000 ffff880128d38000
[62279.779429]  ffff880128d39fd8 0000000000010480 ffff88012b1370d0 0000000000010480
[62279.779437] Call Trace:
[62279.779449]  [<ffffffff812779c6>] ? cfq_set_request+0x33e/0x37e
[62279.779456]  [<ffffffff81277063>] ? cfq_cic_lookup+0x35/0x139
[62279.779462]  [<ffffffff812773a2>] ? cfq_may_queue+0x51/0x6e
[62279.779470]  [<ffffffff8143ed81>] ? io_schedule+0x4e/0x63
[62279.779477]  [<ffffffff8126b276>] ? get_request_wait+0xaa/0x10e
[62279.779484]  [<ffffffff8104f2ad>] ? wake_up_bit+0x23/0x23
[62279.779490]  [<ffffffff8126c2a6>] ? __make_request+0x175/0x26b
[62279.779496]  [<ffffffff8126a267>] ? generic_make_request+0x224/0x289
[62279.779502]  [<ffffffff8126a37f>] ? submit_bio+0xb3/0xbc
[62279.779509]  [<ffffffff81372238>] ? dm_any_congested+0x4f/0x57
[62279.779516]  [<ffffffff81206de6>] ? run_scheduled_bios+0x246/0x3b1
[62279.779523]  [<ffffffff8120c791>] ? worker_loop+0x180/0x4bb
[62279.779529]  [<ffffffff8120c611>] ? btrfs_queue_worker+0x24e/0x24e
[62279.779535]  [<ffffffff8104eee7>] ? kthread+0x7a/0x82
[62279.779542]  [<ffffffff81442554>] ? kernel_thread_helper+0x4/0x10
[62279.779548]  [<ffffffff8104ee6d>] ? kthread_worker_fn+0x149/0x149
[62279.779554]  [<ffffffff81442550>] ? gs_change+0xb/0xb
[62279.779560] btrfs-transacti D 0000000000000001  3856  4689      2 0x00000000
[62279.779568]  ffff88012b205320 0000000000000046 0000000000000000 ffff88012b06d320
[62279.779576]  ffff880128d97fd8 0000000000010480 0000000000004000 ffff880128d96000
[62279.779583]  ffff880128d97fd8 0000000000010480 ffff88012b205320 0000000000010480
[62279.779591] Call Trace:
[62279.779597]  [<ffffffff8120152f>] ? alloc_extent_state+0x12/0x55
[62279.779605]  [<ffffffff810aefbe>] ? kmem_cache_free+0x87/0x8e
[62279.779611]  [<ffffffff8127e2ab>] ? rb_erase+0x134/0x26f
[62279.779617]  [<ffffffff81081326>] ? __lock_page+0x63/0x63
[62279.779622]  [<ffffffff8143ed81>] ? io_schedule+0x4e/0x63
[62279.779628]  [<ffffffff8108132f>] ? sleep_on_page+0x9/0x10
[62279.779633]  [<ffffffff81081326>] ? __lock_page+0x63/0x63
[62279.779638]  [<ffffffff8143f36c>] ? __wait_on_bit+0x3e/0x71
[62279.779644]  [<ffffffff810814c9>] ? wait_on_page_bit+0x6a/0x70
[62279.779650]  [<ffffffff8104f2d7>] ? autoremove_wake_function+0x2a/0x2a
[62279.779657]  [<ffffffff811ebb53>] ? btrfs_wait_marked_extents+0xf5/0x12f
[62279.779664]  [<ffffffff811ebbb6>] ? btrfs_write_and_wait_marked_extents+0x29/0x3d
[62279.779670]  [<ffffffff811ec2b0>] ? btrfs_commit_transaction+0x5c7/0x6e8
[62279.779677]  [<ffffffff810433c4>] ? del_timer_sync+0x34/0x3e
[62279.779682]  [<ffffffff8143f1bd>] ? schedule_timeout+0x182/0x1a0
[62279.779688]  [<ffffffff8104f2ad>] ? wake_up_bit+0x23/0x23
[62279.779694]  [<ffffffff811ec801>] ? start_transaction+0x1e0/0x21a
[62279.779700]  [<ffffffff811e66c4>] ? transaction_kthread+0x180/0x238
[62279.779706]  [<ffffffff811e6544>] ? btrfs_congested_fn+0x87/0x87
[62279.779712]  [<ffffffff811e6544>] ? btrfs_congested_fn+0x87/0x87
[62279.779718]  [<ffffffff8104eee7>] ? kthread+0x7a/0x82
[62279.779724]  [<ffffffff81442554>] ? kernel_thread_helper+0x4/0x10
[62279.779730]  [<ffffffff8104ee6d>] ? kthread_worker_fn+0x149/0x149
[62279.779736]  [<ffffffff81442550>] ? gs_change+0xb/0xb
[62279.779759] btrfs-endio-wri D 0000000000000000  4208 11320      2 0x00000000
[62279.779767]  ffff88012b173570 0000000000000046 0000000000000000 ffffffff8182c020
[62279.779775]  ffff88011afa9fd8 0000000000010480 0000000000004000 ffff88011afa8000
[62279.779782]  ffff88011afa9fd8 0000000000010480 ffff88012b173570 0000000000010480
[62279.779789] Call Trace:
[62279.779796]  [<ffffffff8126a267>] ? generic_make_request+0x224/0x289
[62279.779802]  [<ffffffff811faaeb>] ? lookup_extent_mapping+0x37/0xb3
[62279.779808]  [<ffffffff81081326>] ? __lock_page+0x63/0x63
[62279.779813]  [<ffffffff8143ed81>] ? io_schedule+0x4e/0x63
[62279.779818]  [<ffffffff8108132f>] ? sleep_on_page+0x9/0x10
[62279.779823]  [<ffffffff81081326>] ? __lock_page+0x63/0x63
[62279.779828]  [<ffffffff8143f36c>] ? __wait_on_bit+0x3e/0x71
[62279.779834]  [<ffffffff810814c9>] ? wait_on_page_bit+0x6a/0x70
[62279.779840]  [<ffffffff8104f2d7>] ? autoremove_wake_function+0x2a/0x2a
[62279.779846]  [<ffffffff81205835>] ? read_extent_buffer_pages+0x318/0x39b
[62279.779852]  [<ffffffff811e5a9e>] ? verify_parent_transid+0x1d9/0x1d9
[62279.779859]  [<ffffffff811e6c95>] ? btree_read_extent_buffer_pages.clone.66+0x58/0xb2
[62279.779865]  [<ffffffff811e78b7>] ? read_tree_block+0x31/0x44
[62279.779871]  [<ffffffff811d1a8a>] ? read_block_for_search.clone.41+0x309/0x33f
[62279.779878]  [<ffffffff812115fa>] ? btrfs_tree_read_unlock+0x9/0x33
[62279.779884]  [<ffffffff811cd235>] ? unlock_up+0x114/0x140
[62279.779890]  [<ffffffff811d4203>] ? btrfs_search_slot+0x7e7/0xa5e
[62279.779897]  [<ffffffff811d54fc>] ? btrfs_insert_empty_items+0x62/0xb3
[62279.779904]  [<ffffffff811da616>] ? alloc_reserved_file_extent.clone.68+0x9b/0x213
[62279.779911]  [<ffffffff811dd08c>] ? run_clustered_refs+0x61f/0x70b
[62279.779918]  [<ffffffff811dd241>] ? btrfs_run_delayed_refs+0xc9/0x1cd
[62279.779924]  [<ffffffff811ec46f>] ? __btrfs_end_transaction+0x83/0x1e2
[62279.779931]  [<ffffffff811f171d>] ? btrfs_finish_ordered_io+0x280/0x2a5
[62279.779937]  [<ffffffff81202316>] ? end_bio_extent_writepage+0xa0/0x14a
[62279.779943]  [<ffffffff8120c791>] ? worker_loop+0x180/0x4bb
[62279.779949]  [<ffffffff8120c611>] ? btrfs_queue_worker+0x24e/0x24e
[62279.779955]  [<ffffffff8104eee7>] ? kthread+0x7a/0x82
[62279.779962]  [<ffffffff81442554>] ? kernel_thread_helper+0x4/0x10
[62279.779968]  [<ffffffff8104ee6d>] ? kthread_worker_fn+0x149/0x149
[62279.779974]  [<ffffffff81442550>] ? gs_change+0xb/0xb

# mount | grep btrfs
/dev/mapper/vg0-rootvol on / type btrfs (rw,relatime)

Thanks for all efforts.