On Mon, Oct 31, 2016 at 01:44:55PM -0600, Chris Mason wrote: > On Mon, Oct 31, 2016 at 12:35:16PM -0700, Linus Torvalds wrote: > >On Mon, Oct 31, 2016 at 11:55 AM, Dave Jones <da...@codemonkey.org.uk> > >wrote: > >> > >> BUG: Bad page state in process kworker/u8:12 pfn:4e0e39 > >> page:ffffea0013838e40 count:0 mapcount:0 mapping:ffff8804a20310e0 > >> index:0x100c > >> flags: 0x400000000000000c(referenced|uptodate) > >> page dumped because: non-NULL mapping > > > >Hmm. So this seems to be btrfs-specific, right? > > > >I searched for all your "non-NULL mapping" cases, and they all seem to > >have basically the same call trace, with some work thread doing > >writeback and going through btrfs_writepages(). > > > >Sounds like it's a race with either fallocate hole-punching or > >truncate. I'm not seeing it, but I suspect it's btrfs, since DaveJ > >clearly ran other filesystems too but I am not seeing this backtrace > >for anything else. > > Agreed, I think this is a separate bug, almost certainly btrfs specific. > I'll work with Dave on a better reproducer.
Ok, let's try this.. Here's the usual dump, from the current tree: [ 317.689216] BUG: Bad page state in process kworker/u8:8 pfn:4d8fd4 [ 317.702235] page:ffffea001363f500 count:0 mapcount:0 mapping:ffff8804eef8e0e0 index:0x29 [ 317.718068] flags: 0x400000000000000c(referenced|uptodate) [ 317.731086] page dumped because: non-NULL mapping [ 317.744118] CPU: 3 PID: 1558 Comm: kworker/u8:8 Not tainted 4.9.0-rc6-think+ #1 [ 317.770404] Workqueue: writeback wb_workfn [ 317.783490] (flush-btrfs-3) [ 317.783581] ffffc90000ff3798 [ 317.796651] ffffffff813b69bc [ 317.796742] ffffea001363f500 [ 317.796798] ffffffff81c474ee [ 317.796854] ffffc90000ff37c0 [ 317.809910] ffffffff811b30c4 [ 317.810001] 0000000000000000 [ 317.810055] ffffea001363f500 [ 317.810112] 0000000000000003 [ 317.823230] ffffc90000ff37d0 [ 317.823320] ffffffff811b318f [ 317.823376] ffffc90000ff3818 [ 317.823431] Call Trace: [ 317.836645] [<ffffffff813b69bc>] dump_stack+0x4f/0x73 [ 317.850063] [<ffffffff811b30c4>] bad_page+0xc4/0x130 [ 317.863426] [<ffffffff811b318f>] free_pages_check_bad+0x5f/0x70 [ 317.876805] [<ffffffff811b61f5>] free_hot_cold_page+0x305/0x3b0 [ 317.890132] [<ffffffff811b660e>] free_hot_cold_page_list+0x7e/0x170 [ 317.903410] [<ffffffff811c08b7>] release_pages+0x2e7/0x3a0 [ 317.916600] [<ffffffff811c2067>] __pagevec_release+0x27/0x40 [ 317.929817] [<ffffffffa00b0a55>] extent_write_cache_pages.isra.44.constprop.59+0x355/0x430 [btrfs] [ 317.943073] [<ffffffffa00b0fcd>] extent_writepages+0x5d/0x90 [btrfs] [ 317.956033] [<ffffffffa008ea60>] ? btrfs_releasepage+0x40/0x40 [btrfs] [ 317.968902] [<ffffffffa008b6a8>] btrfs_writepages+0x28/0x30 [btrfs] [ 317.981744] [<ffffffff811be6d1>] do_writepages+0x21/0x30 [ 317.994524] [<ffffffff812727cf>] __writeback_single_inode+0x7f/0x6e0 [ 318.007333] [<ffffffff81894cc1>] ? _raw_spin_unlock+0x31/0x50 [ 318.020090] [<ffffffff8127356d>] writeback_sb_inodes+0x31d/0x6c0 [ 318.032844] [<ffffffff812739a2>] __writeback_inodes_wb+0x92/0xc0 [ 318.045623] [<ffffffff81273e97>] wb_writeback+0x3f7/0x530 [ 318.058304] [<ffffffff81274a38>] wb_workfn+0x148/0x620 [ 318.070949] [<ffffffff810a4b54>] ? process_one_work+0x194/0x690 [ 318.083549] [<ffffffff812748f5>] ? wb_workfn+0x5/0x620 [ 318.096160] [<ffffffff810a4bf2>] process_one_work+0x232/0x690 [ 318.108722] [<ffffffff810a4b54>] ? process_one_work+0x194/0x690 [ 318.121225] [<ffffffff810a509e>] worker_thread+0x4e/0x490 [ 318.133680] [<ffffffff810a5050>] ? process_one_work+0x690/0x690 [ 318.146126] [<ffffffff810a5050>] ? process_one_work+0x690/0x690 [ 318.158507] [<ffffffff810aa7e2>] kthread+0x102/0x120 [ 318.170880] [<ffffffff810aa6e0>] ? kthread_park+0x60/0x60 [ 318.183254] [<ffffffff81895822>] ret_from_fork+0x22/0x30 This time, I managed to get my ftrace magic working, so we have this trace from just before this happened. Does this shed any light ? https://codemonkey.org.uk/junk/trace.txt (note that some of the spinlock/rcu calls will be missing here. I chose to add them to ftrace's exclude list because they dominate the trace buffer so much, especially in debug builds) Dave -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html