https://bugzilla.kernel.org/show_bug.cgi?id=216368
--- Comment #6 from David Sterba (dste...@suse.com) ---
(In reply to Christophe Leroy from comment #3)
> This happens when you get an IRQ while being deep into BTRFS handling it
> seems.
>
> It should be investigated with BTRFS team why the callstack is so deep.

There's nothing strange on the call stack; it contains all the functions expected when handling a page fault, looking up internal structures and then passing the request to the block layer to get the bytes from the device.

A deep stack, measured both by the number of functions and by total size, is normal for filesystems, and we try to keep the size sane. So far we haven't seen such problems on x86_64: the overall stack size is 16K, and on a debug kernel at most about 6K is consumed (as reported by CONFIG_DEBUG_STACK_USAGE=y and CONFIG_SCHED_STACK_END_CHECK=y); the lowest free-stack value I see in my logs is 10576. That is on a simple IO stack, i.e. only what's below the filesystem; as you can see, blk-mq, NVMe and DMA also take some stack space, but that does not look suspicious either. What could be significant is layering with MD, device mapper, NFS, or networking.

The first number in the stack trace is the stack pointer. Calculating what btrfs itself takes:

[c000000019da5cf0] [c0000000004d66b4] .btrfs_submit_bio+0x274/0x5c0
[c000000019da5e00] [c000000000481f44] .btrfs_submit_metadata_bio+0x54/0x110
[c000000019da5e80] [c0000000004bd828] .submit_one_bio+0xb8/0x130
[c000000019da5f00] [c0000000004c84b0] .read_extent_buffer_pages+0x310/0x750
[c000000019da6020] [c000000000481b48] .btrfs_read_extent_buffer+0xd8/0x1b0
[c000000019da60f0] [c00000000048208c] .read_tree_block+0x5c/0x130
[c000000019da6190] [c0000000004609a8] .read_block_for_search+0x2c8/0x410
[c000000019da62b0] [c000000000466a30] .btrfs_search_slot+0x380/0xcf0
[c000000019da6400] [c00000000047adf4] .btrfs_lookup_csum+0x64/0x1d0
[c000000019da64d0] [c00000000047b754] .btrfs_lookup_bio_sums+0x274/0x6e0
[c000000019da6630] [c000000000505d18] .btrfs_submit_compressed_read+0x3b8/0x520
[c000000019da6720] [c0000000004954b4] .btrfs_submit_data_read_bio+0xc4/0xe0
[c000000019da67b0] [c0000000004bd7fc] .submit_one_bio+0x8c/0x130
[c000000019da6830] [c0000000004c4478] .submit_extent_page+0x548/0x590
[c000000019da6980] [c0000000004c4f80] .btrfs_do_readpage+0x330/0x970
[c000000019da6ad0] [c0000000004c67f4] .extent_readahead+0x2b4/0x430
[c000000019da6c70] [c000000000490440] .btrfs_readahead+0x10/0x30

0xc000000019da6c70 - 0xc000000019da5cf0 = 3968

That's on par with my expectation. The total stack space (from the syscall down to the IRQ handler) is:

0xc000000019da7e10 - 0xc000000019da4c00 = 12816

That's getting close to 16K but still leaves a few kilobytes before overflow, and the IRQ has its own stack (which needs to be set up from the kthread/process context).

As you mention KASAN: it can add some stack consumption due to padding and alignment, but so far I don't know what exactly the warning is measuring. Calculating back from do_IRQ by 3072 lands roughly at 0xc000000019da5910, inside blk_mq_flush_plug_list.

I remember some build reports from PPC where, due to a different compiler being used, function inlining caused increased stack space consumption (e.g. aggressive optimizations that unrolled loops too much, using several additional temporary variables). So that should be investigated too before blaming btrfs.
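
For illustration, a minimal sketch in Python of the frame arithmetic above (the script is hypothetical; the trace text and figures are the ones quoted in the comment): on ppc64 the first bracketed value on each trace line is that frame's stack pointer, and the stack grows toward lower addresses, so differencing consecutive pointers gives an approximate per-frame cost.

# Sketch: recompute per-frame and total stack usage from the ppc64
# trace quoted above. Line format: "[sp] [ip] .function+0xoff/0xsize".
trace = """\
[c000000019da5cf0] [c0000000004d66b4] .btrfs_submit_bio+0x274/0x5c0
[c000000019da5e00] [c000000000481f44] .btrfs_submit_metadata_bio+0x54/0x110
[c000000019da5e80] [c0000000004bd828] .submit_one_bio+0xb8/0x130
[c000000019da5f00] [c0000000004c84b0] .read_extent_buffer_pages+0x310/0x750
[c000000019da6020] [c000000000481b48] .btrfs_read_extent_buffer+0xd8/0x1b0
[c000000019da60f0] [c00000000048208c] .read_tree_block+0x5c/0x130
[c000000019da6190] [c0000000004609a8] .read_block_for_search+0x2c8/0x410
[c000000019da62b0] [c000000000466a30] .btrfs_search_slot+0x380/0xcf0
[c000000019da6400] [c00000000047adf4] .btrfs_lookup_csum+0x64/0x1d0
[c000000019da64d0] [c00000000047b754] .btrfs_lookup_bio_sums+0x274/0x6e0
[c000000019da6630] [c000000000505d18] .btrfs_submit_compressed_read+0x3b8/0x520
[c000000019da6720] [c0000000004954b4] .btrfs_submit_data_read_bio+0xc4/0xe0
[c000000019da67b0] [c0000000004bd7fc] .submit_one_bio+0x8c/0x130
[c000000019da6830] [c0000000004c4478] .submit_extent_page+0x548/0x590
[c000000019da6980] [c0000000004c4f80] .btrfs_do_readpage+0x330/0x970
[c000000019da6ad0] [c0000000004c67f4] .extent_readahead+0x2b4/0x430
[c000000019da6c70] [c000000000490440] .btrfs_readahead+0x10/0x30
"""

frames = []
for line in trace.splitlines():
    sp, _ip, func = line.split(maxsplit=2)
    frames.append((int(sp.strip("[]"), 16), func))

# The deepest frame is listed first and the stack grows downward, so
# caller_sp - callee_sp is the stack consumed at that call level.
for (callee_sp, func), (caller_sp, _) in zip(frames, frames[1:]):
    print(f"{func.split('+')[0]:40s} {caller_sp - callee_sp:5d} bytes")

print("btrfs total:", frames[-1][0] - frames[0][0], "bytes")  # 3968

Running this reproduces the 3968-byte total; no single frame stands out, the largest being roughly 416 bytes around .extent_readahead.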
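
In the same spirit, a quick sanity check of the headroom implied by the total above, assuming the 16K overall stack size stated in the comment:

# Span from the syscall entry to the IRQ handler, figures quoted above.
THREAD_SIZE = 16 * 1024                        # 16K overall stack, as stated
used = 0xc000000019da7e10 - 0xc000000019da4c00
print(used)                                    # 12816 bytes in use
print(THREAD_SIZE - used)                      # 3568 bytes of headroom left

Which matches the "still a few kilobytes before overflow" estimate.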