Hi David, because this bisected to a patch you posted
Hi Alistair, because vmf_insert_page_mkwrite() is in the path

A DAX unit test began failing on 6.15-rc1. I chased it as described
below, but need XFS and/or your folio/tail page accounting knowledge
to take it further.

A DAX XFS mapping that is SHARED and R/W fails when the folio is
unexpectedly NULL. Note that XFS PRIVATE always succeeds and XFS
SHARED, READ_ONLY works fine. Also note that all of these cases work
with EXT4.

[ 417.796271] BUG: kernel NULL pointer dereference, address: 0000000000000b00
[ 417.796982] #PF: supervisor read access in kernel mode
[ 417.797540] #PF: error_code(0x0000) - not-present page
[ 417.798123] PGD 2a5c5067 P4D 2a5c5067 PUD 2a5c6067 PMD 0
[ 417.798690] Oops: Oops: 0000 [#1] SMP NOPTI
[ 417.799178] CPU: 5 UID: 0 PID: 1515 Comm: mmap Tainted: G O 6.15.0-rc1-dirty #158 PREEMPT(voluntary)
[ 417.800150] Tainted: [O]=OOT_MODULE
[ 417.800583] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
[ 417.801358] RIP: 0010:__lruvec_stat_mod_folio+0x7e/0x250
[ 417.801948] Code: 85 97 00 00 00 48 8b 43 38 48 89 c3 48 83 e3 f8 a8 02 0f 85 1a 01 00 00 48 85 db 0f 84 28 01 00 00 66 90 49 63 86 80 3e 00 00 <48> 8b 9c c3 00 09 00 00 48 83 c3 40 4c 3b b3 c0 00 00 00 0f 85 68
[ 417.803662] RSP: 0000:ffffc90002be3a08 EFLAGS: 00010206
[ 417.804234] RAX: 0000000000000000 RBX: 0000000000000200 RCX: 0000000000000002
[ 417.804984] RDX: ffffffff815652d7 RSI: 0000000000000000 RDI: ffffffff82a2beae
[ 417.805689] RBP: ffffc90002be3a28 R08: 0000000000000000 R09: 0000000000000000
[ 417.806384] R10: ffffea0007000040 R11: ffff888376ffe000 R12: 0000000000000001
[ 417.807099] R13: 0000000000000012 R14: ffff88807fe4ab40 R15: ffff888029210580
[ 417.807801] FS: 00007f339fa7a740(0000) GS:ffff8881fa9b9000(0000) knlGS:0000000000000000
[ 417.808570] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 417.809193] CR2: 0000000000000b00 CR3: 000000002a4f0004 CR4: 0000000000370ef0
[ 417.809925] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 417.810622] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 417.811353] Call Trace:
[ 417.811709] <TASK>
[ 417.812038] folio_add_file_rmap_ptes+0x143/0x230
[ 417.812566] insert_page_into_pte_locked+0x1ee/0x3c0
[ 417.813132] insert_page+0x78/0xf0
[ 417.813558] vmf_insert_page_mkwrite+0x55/0xa0
[ 417.814088] dax_fault_iter+0x484/0x7b0
[ 417.814542] dax_iomap_pte_fault+0x1ca/0x620
[ 417.815055] dax_iomap_fault+0x39/0x40
[ 417.815499] __xfs_write_fault+0x139/0x380
[ 417.815995] ? __handle_mm_fault+0x5e5/0x1a60
[ 417.816483] xfs_write_fault+0x41/0x50
[ 417.816966] xfs_filemap_fault+0x3b/0xe0
[ 417.817424] __do_fault+0x31/0x180
[ 417.817859] __handle_mm_fault+0xee1/0x1a60
[ 417.818325] ? debug_smp_processor_id+0x17/0x20
[ 417.818844] handle_mm_fault+0xe1/0x2b0
[ 417.819286] do_user_addr_fault+0x217/0x630
[ 417.819747] ? rcu_is_watching+0x11/0x50
[ 417.820185] exc_page_fault+0x6c/0x210
[ 417.820599] asm_exc_page_fault+0x27/0x30
[ 417.821080] RIP: 0033:0x40130c
[ 417.821461] Code: 89 7d d8 48 89 75 d0 e8 94 ff ff ff 48 c7 45 f8 00 00 00 00 48 8b 45 d8 48 89 45 f0 eb 18 48 8b 45 f0 48 8d 50 08 48 89 55 f0 <48> c7 00 01 00 00 00 48 83 45 f8 01 48 8b 45 d0 48 c1 e8 03 48 39
[ 417.823156] RSP: 002b:00007ffcc82a8cb0 EFLAGS: 00010287
[ 417.823703] RAX: 00007f336f5f5000 RBX: 00007ffcc82a8f08 RCX: 0000000067f5a1da
[ 417.824382] RDX: 00007f336f5f5008 RSI: 0000000000000000 RDI: 0000000000036a98
[ 417.825096] RBP: 00007ffcc82a8ce0 R08: 00007f339fa84000 R09: 00000000004040b0
[ 417.825769] R10: 00007f339fa8a200 R11: 00007f339fa8a7b0 R12: 0000000000000000
[ 417.826438] R13: 00007ffcc82a8f28 R14: 0000000000403e18 R15: 00007f339fac3000
[ 417.827148] </TASK>
[ 417.827461] Modules linked in: nd_pmem(O) dax_pmem(O) nd_btt(O) nfit(O) nd_e820(O) libnvdimm(O) nfit_test_iomap(O)
[ 417.828404] CR2: 0000000000000b00
[ 417.828807] ---[ end trace 0000000000000000 ]---
[ 417.829293] RIP: 0010:__lruvec_stat_mod_folio+0x7e/0x250

And then, looking at the page passed to vmf_insert_page_mkwrite():

[ 55.468109] flags: 0x300000000002009(locked|uptodate|reserved|node=0|zone=3)
[ 55.468674] raw: 0300000000002009 ffff888028c27b20 00000000ffffffff ffff888033b69b88
[ 55.469270] raw: 000000000000fff5 0000000000000000 00000001ffffffff 0000000000000200
[ 55.469835] page dumped because: ALISON dump locked & uptodate pages

^ That's different: locked|uptodate. Other pages arriving here do not
have locked|uptodate set.

Git bisect says this is the first bad patch (6.14 --> 6.15-rc1):
4996fc547f5b ("mm: let _folio_nr_pages overlay memcg_data in first tail page")

Experimenting a bit with the patch, un-defining NR_PAGES_IN_LARGE_FOLIO
avoids the problem.

The way that patch reuses memory in tail pages, and the fact that it
only fails on XFS (not ext4), suggests that XFS depends on tail pages
in a way that ext4 does not.

And that's as far as I've gotten.

-- Alison
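
For reference, the access pattern described above (SHARED + R/W fails,
PRIVATE and SHARED + READ_ONLY succeed) can be sketched roughly as
follows. This is a minimal sketch, not the actual unit test:
touch_mapping() and its arguments are hypothetical names, and it only
exercises the DAX fault path if the file lives on a DAX-mounted
filesystem. The 8-byte store of 1 is an assumption based on the
faulting user-space instruction in the trace.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not the real test): map `len` bytes of `path`
 * with the given protection and mapping flags, then touch every
 * 8-byte word. With PROT_WRITE each word is written, which takes the
 * write-fault path; otherwise each word is only read.
 * Returns the number of words touched, or -1 on error.
 */
long touch_mapping(const char *path, size_t len, int prot, int flags)
{
	int fd = open(path, O_RDWR | O_CREAT, 0600);
	if (fd < 0)
		return -1;

	/* Make sure the file is at least `len` bytes long. */
	if (ftruncate(fd, (off_t)len) != 0) {
		close(fd);
		return -1;
	}

	void *addr = mmap(NULL, len, prot, flags, fd, 0);
	close(fd); /* the mapping stays valid after close */
	if (addr == MAP_FAILED)
		return -1;

	volatile uint64_t *p = addr;
	uint64_t sink = 0;
	long words = 0;

	for (size_t i = 0; i < len / sizeof(uint64_t); i++) {
		if (prot & PROT_WRITE)
			p[i] = 1;	/* store: write fault on first touch */
		else
			sink += p[i];	/* load: read fault on first touch */
		words++;
	}
	(void)sink;

	munmap(addr, len);
	return words;
}
```

Calling touch_mapping(path, len, PROT_READ | PROT_WRITE, MAP_SHARED)
corresponds to the failing case; swapping in MAP_PRIVATE, or dropping
PROT_WRITE, corresponds to the cases that succeed.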