Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page
On Tue, Sep 22, 2020 at 06:05:26PM +0100, Matthew Wilcox wrote: > On Tue, Sep 22, 2020 at 12:23:45PM -0400, Qian Cai wrote: > > On Fri, 2020-09-11 at 00:47 +0100, Matthew Wilcox (Oracle) wrote: > > > Size the uptodate array dynamically to support larger pages in the > > > page cache. With a 64kB page, we're only saving 8 bytes per page today, > > > but with a 2MB maximum page size, we'd have to allocate more than 4kB > > > per page. Add a few debugging assertions. > > > > > > Signed-off-by: Matthew Wilcox (Oracle) > > > Reviewed-by: Dave Chinner > > > > Some syscall fuzzing will trigger this on powerpc: > > > > .config: https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config > > > > [ 8805.895344][T445431] WARNING: CPU: 61 PID: 445431 at > > fs/iomap/buffered-io.c:78 iomap_page_release+0x250/0x270 > > Well, I'm glad it triggered. That warning is: > WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) != > PageUptodate(page)); > so there was definitely a problem of some kind. OK, I'm pretty sure the problem predated this commit, and it's simply that I added a warning (looking for something else) that caught this. I have a tree completly gunked up with debugging code now to try to understand the problem better, but my guess is that if you put this warning into a previous version, you'd see the same problem occurring (and it is a real problem, because we skip writeback of parts of the page which are !uptodate). ___ Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org To unsubscribe send an email to linux-nvdimm-le...@lists.01.org
Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page
On Tue, Sep 22, 2020 at 10:00:01PM -0700, Darrick J. Wong wrote: > On Wed, Sep 23, 2020 at 03:48:59AM +0100, Matthew Wilcox wrote: > > On Tue, Sep 22, 2020 at 09:06:03PM -0400, Qian Cai wrote: > > > On Tue, 2020-09-22 at 18:05 +0100, Matthew Wilcox wrote: > > > > On Tue, Sep 22, 2020 at 12:23:45PM -0400, Qian Cai wrote: > > > > > On Fri, 2020-09-11 at 00:47 +0100, Matthew Wilcox (Oracle) wrote: > > > > > > Size the uptodate array dynamically to support larger pages in the > > > > > > page cache. With a 64kB page, we're only saving 8 bytes per page > > > > > > today, > > > > > > but with a 2MB maximum page size, we'd have to allocate more than > > > > > > 4kB > > > > > > per page. Add a few debugging assertions. > > > > > > > > > > > > Signed-off-by: Matthew Wilcox (Oracle) > > > > > > Reviewed-by: Dave Chinner > > > > > > > > > > Some syscall fuzzing will trigger this on powerpc: > > > > > > > > > > .config: > > > > > https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config > > > > > > > > > > [ 8805.895344][T445431] WARNING: CPU: 61 PID: 445431 at > > > > > fs/iomap/buffered- > > > > > io.c:78 iomap_page_release+0x250/0x270 > > > > > > > > Well, I'm glad it triggered. That warning is: > > > > WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) != > > > > PageUptodate(page)); > > > > so there was definitely a problem of some kind. > > > > > > > > truncate_cleanup_page() calls > > > > do_invalidatepage() calls > > > > iomap_invalidatepage() calls > > > > iomap_page_release() > > > > > > > > Is this the first warning? I'm wondering if maybe there was an I/O > > > > error > > > > earlier which caused PageUptodate to get cleared again. If it's easy to > > > > reproduce, perhaps you could try something like this? > > > > > > > > +void dump_iomap_page(struct page *page, const char *reason) > > > > +{ > > > > + struct iomap_page *iop = to_iomap_page(page); > > > > + unsigned int nr_blocks = i_blocks_per_page(page->mapping->host, > > > > page); > > > > + > > > > + dump_page(page, reason); > > > > + if (iop) > > > > + printk("iop:reads %d writes %d uptodate %*pb\n", > > > > + atomic_read(>read_bytes_pending), > > > > + atomic_read(>write_bytes_pending), > > > > + nr_blocks, iop->uptodate); > > > > + else > > > > + printk("iop:none\n"); > > > > +} > > > > > > > > and then do something like: > > > > > > > > if (bitmap_full(iop->uptodate, nr_blocks) != PageUptodate(page)) > > > > dump_iomap_page(page, NULL); > > > > > > This: > > > > > > [ 1683.158254][T164965] page:4a6c16cd refcount:2 mapcount:0 > > > mapping:ea017dc5 index:0x2 pfn:0xc365c > > > [ 1683.158311][T164965] aops:xfs_address_space_operations ino:417b7e7 > > > dentry name:"trinity-testfile2" > > > [ 1683.158354][T164965] flags: 0x7fff800015(locked|uptodate|lru) > > > [ 1683.158392][T164965] raw: 007fff800015 c00c019c4b08 > > > c00c019a53c8 c000201c8362c1e8 > > > [ 1683.158430][T164965] raw: 0002 > > > 0002 c000201c54db4000 > > > [ 1683.158470][T164965] page->mem_cgroup:c000201c54db4000 > > > [ 1683.158506][T164965] iop:none > > > > Oh, I'm a fool. This is after the call to detach_page_private() so > > page->private is NULL and we don't get the iop dumped. > > > > Nevertheless, this is interesting. Somehow, the page is marked Uptodate, > > but the bitmap is deemed not full. There are three places where we set > > an iomap page Uptodate: > > > > 1. if (bitmap_full(iop->uptodate, i_blocks_per_page(inode, page))) > > SetPageUptodate(page); > > > > 2. if (page_has_private(page)) > > iomap_iop_set_range_uptodate(page, off, len); > > else > > SetPageUptodate(page); > > > > 3. BUG_ON(page->index); > > ... > > SetPageUptodate(page); > > > > It can't be #2 because the page has an iop. It can't be #3 because the > > page->index is not 0. So at some point in the past, the bitmap was full. > > > > I don't think it's possible for inode->i_blksize to change, and you > > aren't running with THPs, so it's definitely not possible for thp_size() > > to change. So i_blocks_per_page() isn't going to change. > > > > We seem to have allocated enough memory for ->iop because that's also > > based on i_blocks_per_page(). > > > > I'm out of ideas. Maybe I'll wake up with a better idea in the morning. > > I've been trying to reproduce this on x86 with a 1kB block size > > filesystem, and haven't been able to yet. Maybe I'll try to setup a > > powerpc cross-compilation environment tomorrow. > > FWIW I managed to reproduce it with the following fstests configuration > on a 1k block size fs on a x86 machinE: > > SECTION -- -no-sections- > FSTYP-- xfs > MKFS_OPTIONS --
Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page
On Wed, 2020-09-23 at 03:48 +0100, Matthew Wilcox wrote: > I'm out of ideas. Maybe I'll wake up with a better idea in the morning. > I've been trying to reproduce this on x86 with a 1kB block size > filesystem, and haven't been able to yet. Maybe I'll try to setup a > powerpc cross-compilation environment tomorrow. Another data point is that this can be reproduced on both arm64 and powerpc (both have 64k-size pages) by running this LTP test case: https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/syscalls/preadv2/preadv203.c [ 2397.723289] preadv203 (80525): drop_caches: 3 [ 2400.796402] XFS (loop0): Unmounting Filesystem [ 2401.431293] [ cut here ] [ 2401.431411] WARNING: CPU: 0 PID: 80501 at fs/iomap/buffered-io.c:78 iomap_page_release+0x250/0x270 [ 2401.431435] Modules linked in: vfio_pci vfio_virqfd vfio_iommu_spapr_tce vfio vfio_spapr_eeh loop kvm_hv kvm ip_tables x_tables sd_mod bnx2x mdio ahci libahci tg3 libphy libata firmware_class dm_mirror dm_reg ion_hash dm_log dm_mod [ 2401.431534] CPU: 0 PID: 80501 Comm: preadv203 Not tainted 5.9.0-rc6-next-20200923+ #4 [ 2401.431567] NIP: c0473b30 LR: c04739ec CTR: c0473e80 [ 2401.431590] REGS: c011a7bbf7a0 TRAP: 0700 Not tainted (5.9.0-rc6-next-20200923+) [ 2401.431613] MSR: 90029033 CR: 44000244 XER: 2004 [ 2401.431655] CFAR: c0473a24 IRQMASK: 0 GPR00: c029b6a8 c011a7bbfa30 c7e87e00 0005 GPR04: 0005 0005 GPR08: c00c000806f1f587 087fff800037 0001 c875e028 GPR12: c0473e80 cc18 GPR16: c011a7bbfbb0 c011a7bbfbc0 GPR20: c0a4aad8 000f GPR24: c011a7bbfb38 c011a7bbfac0 GPR28: 0001 c00ea4fe0a28 c00c0008071b46c0 [ 2401.431856] NIP [c0473b30] iomap_page_release+0x250/0x270 [ 2401.431878] LR [c04739ec] iomap_page_release+0x10c/0x270 [ 2401.431899] Call Trace: [ 2401.431926] [c011a7bbfa30] [c011a7bbfac0] 0xc011a7bbfac0 (unreliable) [ 2401.431962] [c011a7bbfa70] [c029b6a8] truncate_cleanup_page+0x188/0x2e0 [ 2401.431987] [c011a7bbfaa0] [c029c62c] truncate_inode_pages_range+0x23c/0x9f0 [ 2401.432023] [c011a7bbfcc0] [c03d9588] evict+0x1e8/0x200 [ 2401.432055] [c011a7bbfd00] [c03c64b0] do_unlinkat+0x2d0/0x3a0 [ 2401.432081] [c011a7bbfdc0] [c002a458] system_call_exception+0xf8/0x1d0 [ 2401.432097] [c011a7bbfe20] [c000d0a8] system_call_common+0xe8/0x218 [ 2401.432120] Instruction dump: [ 2401.432140] 3c82f8ba 388490b8 4be655b1 6000 0fe0 6000 6000 6000 [ 2401.432176] 0fe0 4bfffea8 6000 6000 <0fe0> 4bfffef4 6000 6000 [ 2401.432213] CPU: 0 PID: 80501 Comm: preadv203 Not tainted 5.9.0-rc6-next-20200923+ #4 [ 2401.432245] Call Trace: [ 2401.432255] [c011a7bbf590] [c0644778] dump_stack+0xec/0x144 (unreliable) [ 2401.432284] [c011a7bbf5d0] [c00b0a24] __warn+0xc4/0x144 [ 2401.432298] [c011a7bbf660] [c0643388] report_bug+0x108/0x1f0 [ 2401.432331] [c011a7bbf700] [c0021714] program_check_exception+0x104/0x2e0 [ 2401.432359] [c011a7bbf730] [c0009664] program_check_common_virt+0x2c4/0x310 [ 2401.432385] --- interrupt: 700 at iomap_page_release+0x250/0x270 LR = iomap_page_release+0x10c/0x270 [ 2401.432420] [c011a7bbfa30] [c011a7bbfac0] 0xc011a7bbfac0 (unreliable) [ 2401.432446] [c011a7bbfa70] [c029b6a8] truncate_cleanup_page+0x188/0x2e0 [ 2401.432470] [c011a7bbfaa0] [c029c62c] truncate_inode_pages_range+0x23c/0x9f0 [ 2401.432505] [c011a7bbfcc0] [c03d9588] evict+0x1e8/0x200 [ 2401.432528] [c011a7bbfd00] [c03c64b0] do_unlinkat+0x2d0/0x3a0 [ 2401.432552] [c011a7bbfdc0] [c002a458] system_call_exception+0xf8/0x1d0 [ 2401.432576] [c011a7bbfe20] [c000d0a8] system_call_common+0xe8/0x218 [ 2401.432599] irq event stamp: 210662 [ 2401.432619] hardirqs last enabled at (210661): [] _raw_spin_unlock_irq+0x3c/0x70 [ 2401.432652] hardirqs last disabled at (210662): [] program_check_common_virt+0x2bc/0x310 [ 2401.432688] softirqs last enabled at (210280): [] xfs_buf_find.isra.31+0xb54/0x1060 [ 2401.432722] softirqs last disabled at (210278): [] xfs_buf_find.isra.31+0x46c/0x1060 ___ Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org To unsubscribe send an email to linux-nvdimm-le...@lists.01.org
Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page
On Wed, Sep 23, 2020 at 03:48:59AM +0100, Matthew Wilcox wrote: > On Tue, Sep 22, 2020 at 09:06:03PM -0400, Qian Cai wrote: > > On Tue, 2020-09-22 at 18:05 +0100, Matthew Wilcox wrote: > > > On Tue, Sep 22, 2020 at 12:23:45PM -0400, Qian Cai wrote: > > > > On Fri, 2020-09-11 at 00:47 +0100, Matthew Wilcox (Oracle) wrote: > > > > > Size the uptodate array dynamically to support larger pages in the > > > > > page cache. With a 64kB page, we're only saving 8 bytes per page > > > > > today, > > > > > but with a 2MB maximum page size, we'd have to allocate more than 4kB > > > > > per page. Add a few debugging assertions. > > > > > > > > > > Signed-off-by: Matthew Wilcox (Oracle) > > > > > Reviewed-by: Dave Chinner > > > > > > > > Some syscall fuzzing will trigger this on powerpc: > > > > > > > > .config: https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config > > > > > > > > [ 8805.895344][T445431] WARNING: CPU: 61 PID: 445431 at > > > > fs/iomap/buffered- > > > > io.c:78 iomap_page_release+0x250/0x270 > > > > > > Well, I'm glad it triggered. That warning is: > > > WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) != > > > PageUptodate(page)); > > > so there was definitely a problem of some kind. > > > > > > truncate_cleanup_page() calls > > > do_invalidatepage() calls > > > iomap_invalidatepage() calls > > > iomap_page_release() > > > > > > Is this the first warning? I'm wondering if maybe there was an I/O error > > > earlier which caused PageUptodate to get cleared again. If it's easy to > > > reproduce, perhaps you could try something like this? > > > > > > +void dump_iomap_page(struct page *page, const char *reason) > > > +{ > > > + struct iomap_page *iop = to_iomap_page(page); > > > + unsigned int nr_blocks = i_blocks_per_page(page->mapping->host, > > > page); > > > + > > > + dump_page(page, reason); > > > + if (iop) > > > + printk("iop:reads %d writes %d uptodate %*pb\n", > > > + atomic_read(>read_bytes_pending), > > > + atomic_read(>write_bytes_pending), > > > + nr_blocks, iop->uptodate); > > > + else > > > + printk("iop:none\n"); > > > +} > > > > > > and then do something like: > > > > > > if (bitmap_full(iop->uptodate, nr_blocks) != PageUptodate(page)) > > > dump_iomap_page(page, NULL); > > > > This: > > > > [ 1683.158254][T164965] page:4a6c16cd refcount:2 mapcount:0 > > mapping:ea017dc5 index:0x2 pfn:0xc365c > > [ 1683.158311][T164965] aops:xfs_address_space_operations ino:417b7e7 > > dentry name:"trinity-testfile2" > > [ 1683.158354][T164965] flags: 0x7fff800015(locked|uptodate|lru) > > [ 1683.158392][T164965] raw: 007fff800015 c00c019c4b08 > > c00c019a53c8 c000201c8362c1e8 > > [ 1683.158430][T164965] raw: 0002 > > 0002 c000201c54db4000 > > [ 1683.158470][T164965] page->mem_cgroup:c000201c54db4000 > > [ 1683.158506][T164965] iop:none > > Oh, I'm a fool. This is after the call to detach_page_private() so > page->private is NULL and we don't get the iop dumped. > > Nevertheless, this is interesting. Somehow, the page is marked Uptodate, > but the bitmap is deemed not full. There are three places where we set > an iomap page Uptodate: > > 1. if (bitmap_full(iop->uptodate, i_blocks_per_page(inode, page))) > SetPageUptodate(page); > > 2. if (page_has_private(page)) > iomap_iop_set_range_uptodate(page, off, len); > else > SetPageUptodate(page); > > 3. BUG_ON(page->index); > ... > SetPageUptodate(page); > > It can't be #2 because the page has an iop. It can't be #3 because the > page->index is not 0. So at some point in the past, the bitmap was full. > > I don't think it's possible for inode->i_blksize to change, and you > aren't running with THPs, so it's definitely not possible for thp_size() > to change. So i_blocks_per_page() isn't going to change. > > We seem to have allocated enough memory for ->iop because that's also > based on i_blocks_per_page(). > > I'm out of ideas. Maybe I'll wake up with a better idea in the morning. > I've been trying to reproduce this on x86 with a 1kB block size > filesystem, and haven't been able to yet. Maybe I'll try to setup a > powerpc cross-compilation environment tomorrow. FWIW I managed to reproduce it with the following fstests configuration on a 1k block size fs on a x86 machinE: SECTION -- -no-sections- FSTYP-- xfs MKFS_OPTIONS -- -m reflink=1,rmapbt=1 -i sparse=1 -b size=1024 MOUNT_OPTIONS -- -o usrquota,grpquota,prjquota HOST_OPTIONS -- local.config CHECK_OPTIONS -- -g auto XFS_MKFS_OPTIONS -- -bsize=4096 TIME_FACTOR -- 1 LOAD_FACTOR -- 1 TEST_DIR -- /mnt TEST_DEV -- /dev/sde SCRATCH_DEV -- /dev/sdd SCRATCH_MNT -- /opt OVL_UPPER--
Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page
On Tue, Sep 22, 2020 at 09:06:03PM -0400, Qian Cai wrote: > On Tue, 2020-09-22 at 18:05 +0100, Matthew Wilcox wrote: > > On Tue, Sep 22, 2020 at 12:23:45PM -0400, Qian Cai wrote: > > > On Fri, 2020-09-11 at 00:47 +0100, Matthew Wilcox (Oracle) wrote: > > > > Size the uptodate array dynamically to support larger pages in the > > > > page cache. With a 64kB page, we're only saving 8 bytes per page today, > > > > but with a 2MB maximum page size, we'd have to allocate more than 4kB > > > > per page. Add a few debugging assertions. > > > > > > > > Signed-off-by: Matthew Wilcox (Oracle) > > > > Reviewed-by: Dave Chinner > > > > > > Some syscall fuzzing will trigger this on powerpc: > > > > > > .config: https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config > > > > > > [ 8805.895344][T445431] WARNING: CPU: 61 PID: 445431 at fs/iomap/buffered- > > > io.c:78 iomap_page_release+0x250/0x270 > > > > Well, I'm glad it triggered. That warning is: > > WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) != > > PageUptodate(page)); > > so there was definitely a problem of some kind. > > > > truncate_cleanup_page() calls > > do_invalidatepage() calls > > iomap_invalidatepage() calls > > iomap_page_release() > > > > Is this the first warning? I'm wondering if maybe there was an I/O error > > earlier which caused PageUptodate to get cleared again. If it's easy to > > reproduce, perhaps you could try something like this? > > > > +void dump_iomap_page(struct page *page, const char *reason) > > +{ > > + struct iomap_page *iop = to_iomap_page(page); > > + unsigned int nr_blocks = i_blocks_per_page(page->mapping->host, > > page); > > + > > + dump_page(page, reason); > > + if (iop) > > + printk("iop:reads %d writes %d uptodate %*pb\n", > > + atomic_read(>read_bytes_pending), > > + atomic_read(>write_bytes_pending), > > + nr_blocks, iop->uptodate); > > + else > > + printk("iop:none\n"); > > +} > > > > and then do something like: > > > > if (bitmap_full(iop->uptodate, nr_blocks) != PageUptodate(page)) > > dump_iomap_page(page, NULL); > > This: > > [ 1683.158254][T164965] page:4a6c16cd refcount:2 mapcount:0 > mapping:ea017dc5 index:0x2 pfn:0xc365c > [ 1683.158311][T164965] aops:xfs_address_space_operations ino:417b7e7 dentry > name:"trinity-testfile2" > [ 1683.158354][T164965] flags: 0x7fff800015(locked|uptodate|lru) > [ 1683.158392][T164965] raw: 007fff800015 c00c019c4b08 > c00c019a53c8 c000201c8362c1e8 > [ 1683.158430][T164965] raw: 0002 > 0002 c000201c54db4000 > [ 1683.158470][T164965] page->mem_cgroup:c000201c54db4000 > [ 1683.158506][T164965] iop:none Oh, I'm a fool. This is after the call to detach_page_private() so page->private is NULL and we don't get the iop dumped. Nevertheless, this is interesting. Somehow, the page is marked Uptodate, but the bitmap is deemed not full. There are three places where we set an iomap page Uptodate: 1. if (bitmap_full(iop->uptodate, i_blocks_per_page(inode, page))) SetPageUptodate(page); 2. if (page_has_private(page)) iomap_iop_set_range_uptodate(page, off, len); else SetPageUptodate(page); 3. BUG_ON(page->index); ... SetPageUptodate(page); It can't be #2 because the page has an iop. It can't be #3 because the page->index is not 0. So at some point in the past, the bitmap was full. I don't think it's possible for inode->i_blksize to change, and you aren't running with THPs, so it's definitely not possible for thp_size() to change. So i_blocks_per_page() isn't going to change. We seem to have allocated enough memory for ->iop because that's also based on i_blocks_per_page(). I'm out of ideas. Maybe I'll wake up with a better idea in the morning. I've been trying to reproduce this on x86 with a 1kB block size filesystem, and haven't been able to yet. Maybe I'll try to setup a powerpc cross-compilation environment tomorrow. ___ Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org To unsubscribe send an email to linux-nvdimm-le...@lists.01.org
Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page
On Tue, 2020-09-22 at 18:05 +0100, Matthew Wilcox wrote: > On Tue, Sep 22, 2020 at 12:23:45PM -0400, Qian Cai wrote: > > On Fri, 2020-09-11 at 00:47 +0100, Matthew Wilcox (Oracle) wrote: > > > Size the uptodate array dynamically to support larger pages in the > > > page cache. With a 64kB page, we're only saving 8 bytes per page today, > > > but with a 2MB maximum page size, we'd have to allocate more than 4kB > > > per page. Add a few debugging assertions. > > > > > > Signed-off-by: Matthew Wilcox (Oracle) > > > Reviewed-by: Dave Chinner > > > > Some syscall fuzzing will trigger this on powerpc: > > > > .config: https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config > > > > [ 8805.895344][T445431] WARNING: CPU: 61 PID: 445431 at fs/iomap/buffered- > > io.c:78 iomap_page_release+0x250/0x270 > > Well, I'm glad it triggered. That warning is: > WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) != > PageUptodate(page)); > so there was definitely a problem of some kind. > > truncate_cleanup_page() calls > do_invalidatepage() calls > iomap_invalidatepage() calls > iomap_page_release() > > Is this the first warning? I'm wondering if maybe there was an I/O error > earlier which caused PageUptodate to get cleared again. If it's easy to > reproduce, perhaps you could try something like this? > > +void dump_iomap_page(struct page *page, const char *reason) > +{ > + struct iomap_page *iop = to_iomap_page(page); > + unsigned int nr_blocks = i_blocks_per_page(page->mapping->host, page); > + > + dump_page(page, reason); > + if (iop) > + printk("iop:reads %d writes %d uptodate %*pb\n", > + atomic_read(>read_bytes_pending), > + atomic_read(>write_bytes_pending), > + nr_blocks, iop->uptodate); > + else > + printk("iop:none\n"); > +} > > and then do something like: > > if (bitmap_full(iop->uptodate, nr_blocks) != PageUptodate(page)) > dump_iomap_page(page, NULL); This: [ 1683.158254][T164965] page:4a6c16cd refcount:2 mapcount:0 mapping:ea017dc5 index:0x2 pfn:0xc365c [ 1683.158311][T164965] aops:xfs_address_space_operations ino:417b7e7 dentry name:"trinity-testfile2" [ 1683.158354][T164965] flags: 0x7fff800015(locked|uptodate|lru) [ 1683.158392][T164965] raw: 007fff800015 c00c019c4b08 c00c019a53c8 c000201c8362c1e8 [ 1683.158430][T164965] raw: 0002 0002 c000201c54db4000 [ 1683.158470][T164965] page->mem_cgroup:c000201c54db4000 [ 1683.158506][T164965] iop:none ___ Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org To unsubscribe send an email to linux-nvdimm-le...@lists.01.org
Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page
On Tue, 2020-09-22 at 18:05 +0100, Matthew Wilcox wrote: > On Tue, Sep 22, 2020 at 12:23:45PM -0400, Qian Cai wrote: > > On Fri, 2020-09-11 at 00:47 +0100, Matthew Wilcox (Oracle) wrote: > > > Size the uptodate array dynamically to support larger pages in the > > > page cache. With a 64kB page, we're only saving 8 bytes per page today, > > > but with a 2MB maximum page size, we'd have to allocate more than 4kB > > > per page. Add a few debugging assertions. > > > > > > Signed-off-by: Matthew Wilcox (Oracle) > > > Reviewed-by: Dave Chinner > > > > Some syscall fuzzing will trigger this on powerpc: > > > > .config: https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config > > > > [ 8805.895344][T445431] WARNING: CPU: 61 PID: 445431 at fs/iomap/buffered- > > io.c:78 iomap_page_release+0x250/0x270 > > Well, I'm glad it triggered. That warning is: > WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) != > PageUptodate(page)); > so there was definitely a problem of some kind. > > truncate_cleanup_page() calls > do_invalidatepage() calls > iomap_invalidatepage() calls > iomap_page_release() > > Is this the first warning? I'm wondering if maybe there was an I/O error > earlier which caused PageUptodate to get cleared again. If it's easy to > reproduce, perhaps you could try something like this? Yes, this is the first warning. BTW, I did run the reproducer of a805c111650c ("iomap: fix WARN_ON_ONCE() from unprivileged users") earlier, so I am wondering if this is just another victim WARN_ON_ONCE() from it. > > +void dump_iomap_page(struct page *page, const char *reason) > +{ > + struct iomap_page *iop = to_iomap_page(page); > + unsigned int nr_blocks = i_blocks_per_page(page->mapping->host, page); > + > + dump_page(page, reason); > + if (iop) > + printk("iop:reads %d writes %d uptodate %*pb\n", > + atomic_read(>read_bytes_pending), > + atomic_read(>write_bytes_pending), > + nr_blocks, iop->uptodate); > + else > + printk("iop:none\n"); > +} > > and then do something like: > > if (bitmap_full(iop->uptodate, nr_blocks) != PageUptodate(page)) > dump_iomap_page(page, NULL); > ___ Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org To unsubscribe send an email to linux-nvdimm-le...@lists.01.org
Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page
On Tue, Sep 22, 2020 at 12:23:45PM -0400, Qian Cai wrote: > On Fri, 2020-09-11 at 00:47 +0100, Matthew Wilcox (Oracle) wrote: > > Size the uptodate array dynamically to support larger pages in the > > page cache. With a 64kB page, we're only saving 8 bytes per page today, > > but with a 2MB maximum page size, we'd have to allocate more than 4kB > > per page. Add a few debugging assertions. > > > > Signed-off-by: Matthew Wilcox (Oracle) > > Reviewed-by: Dave Chinner > > Some syscall fuzzing will trigger this on powerpc: > > .config: https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config > > [ 8805.895344][T445431] WARNING: CPU: 61 PID: 445431 at > fs/iomap/buffered-io.c:78 iomap_page_release+0x250/0x270 Well, I'm glad it triggered. That warning is: WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) != PageUptodate(page)); so there was definitely a problem of some kind. truncate_cleanup_page() calls do_invalidatepage() calls iomap_invalidatepage() calls iomap_page_release() Is this the first warning? I'm wondering if maybe there was an I/O error earlier which caused PageUptodate to get cleared again. If it's easy to reproduce, perhaps you could try something like this? +void dump_iomap_page(struct page *page, const char *reason) +{ + struct iomap_page *iop = to_iomap_page(page); + unsigned int nr_blocks = i_blocks_per_page(page->mapping->host, page); + + dump_page(page, reason); + if (iop) + printk("iop:reads %d writes %d uptodate %*pb\n", + atomic_read(>read_bytes_pending), + atomic_read(>write_bytes_pending), + nr_blocks, iop->uptodate); + else + printk("iop:none\n"); +} and then do something like: if (bitmap_full(iop->uptodate, nr_blocks) != PageUptodate(page)) dump_iomap_page(page, NULL); ___ Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org To unsubscribe send an email to linux-nvdimm-le...@lists.01.org
Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page
On Fri, 2020-09-11 at 00:47 +0100, Matthew Wilcox (Oracle) wrote: > Size the uptodate array dynamically to support larger pages in the > page cache. With a 64kB page, we're only saving 8 bytes per page today, > but with a 2MB maximum page size, we'd have to allocate more than 4kB > per page. Add a few debugging assertions. > > Signed-off-by: Matthew Wilcox (Oracle) > Reviewed-by: Dave Chinner Some syscall fuzzing will trigger this on powerpc: .config: https://gitlab.com/cailca/linux-mm/-/blob/master/powerpc.config [ 8805.895344][T445431] WARNING: CPU: 61 PID: 445431 at fs/iomap/buffered-io.c:78 iomap_page_release+0x250/0x270 [ 8805.895376][T445431] Modules linked in: vfio_pci vfio_virqfd vfio_iommu_spapr_tce vfio vfio_spapr_eeh loop kvm_hv kvm ip_tables x_tables sd_mod bnx2x tg3 ahci libahci mdio libphy firmware_class libata dm_mirror dm_region_hash dm_log dm_mod [ 8805.895521][T445431] CPU: 61 PID: 445431 Comm: trinity-c61 Not tainted 5.9.0-rc6-next-20200922 #3 [ 8805.895551][T445431] NIP: c04734a0 LR: c047335c CTR: [ 8805.895571][T445431] REGS: c01fe5427620 TRAP: 0700 Not tainted (5.9.0-rc6-next-20200922) [ 8805.895609][T445431] MSR: 90029033 CR: 44002244 XER: 2004 [ 8805.895671][T445431] CFAR: c0473394 IRQMASK: 0 [ 8805.895671][T445431] GPR00: c029b698 c01fe54278b0 c5687e00 [ 8805.895671][T445431] GPR04: [ 8805.895671][T445431] GPR08: c00c02c4b5c7 007fff800015 0001 c5f5e028 [ 8805.895671][T445431] GPR12: 2000 c01c0580 10037be0 109ec8e0 [ 8805.895671][T445431] GPR16: 109eccb0 c01fe54279c0 c01fe5427a40 [ 8805.895671][T445431] GPR20: c0a4aa80 0008 000f [ 8805.895671][T445431] GPR24: c01fe54279b8 c01fe5427940 0001 [ 8805.895671][T445431] GPR28: 0001 c014dd4f9128 c00c06273a80 [ 8805.895989][T445431] NIP [c04734a0] iomap_page_release+0x250/0x270 [ 8805.896016][T445431] LR [c047335c] iomap_page_release+0x10c/0x270 [ 8805.896041][T445431] Call Trace: [ 8805.896065][T445431] [c01fe54278b0] [c01fe5427940] 0xc01fe5427940 (unreliable) [ 8805.896101][T445431] [c01fe54278f0] [c029b698] truncate_cleanup_page+0x188/0x2e0 [ 8805.896144][T445431] [c01fe5427920] [c029c61c] truncate_inode_pages_range+0x23c/0x9f0 [ 8805.896187][T445431] [c01fe5427b40] [c029ceec] truncate_pagecache+0x5c/0x90 [ 8805.896229][T445431] [c01fe5427b80] [c0569858] xfs_setattr_size+0xb8/0x5d0 [ 8805.896270][T445431] [c01fe5427c10] [c056a26c] xfs_vn_setattr+0x8c/0x130 [ 8805.896321][T445431] [c01fe5427c60] [c03dbe60] notify_change+0x390/0x5b0 [ 8805.896372][T445431] [c01fe5427cd0] [c03a5294] do_truncate+0x94/0x130 [ 8805.896412][T445431] [c01fe5427d60] [c03a57a8] do_sys_ftruncate+0xe8/0x160 [ 8805.896455][T445431] [c01fe5427dc0] [c002a458] system_call_exception+0xf8/0x1d0 [ 8805.896496][T445431] [c01fe5427e20] [c000d0a8] system_call_common+0xe8/0x218 [ 8805.896544][T445431] Instruction dump: [ 8805.896569][T445431] 3c82fb3a 388490c8 4be65be1 6000 0fe0 6000 6000 6000 [ 8805.896623][T445431] 0fe0 4bfffea8 6000 6000 <0fe0> 4bfffef4 6000 6000 [ 8805.896678][T445431] CPU: 61 PID: 445431 Comm: trinity-c61 Not tainted 5.9.0-rc6-next-20200922 #3 [ 8805.896716][T445431] Call Trace: [ 8805.896750][T445431] [c01fe5427410] [c06440e8] dump_stack+0xec/0x144 (unreliable) [ 8805.896796][T445431] [c01fe5427450] [c00b0a24] __warn+0xc4/0x144 [ 8805.896837][T445431] [c01fe54274e0] [c0642cf8] report_bug+0x108/0x1f0 [ 8805.896878][T445431] [c01fe5427580] [c0021714] program_check_exception+0x104/0x2e0 [ 8805.896922][T445431] [c01fe54275b0] [c0009664] program_check_common_virt+0x2c4/0x310 [ 8805.896965][T445431] --- interrupt: 700 at iomap_page_release+0x250/0x270 [ 8805.896965][T445431] LR = iomap_page_release+0x10c/0x270 [ 8805.897018][T445431] [c01fe54278b0] [c01fe5427940] 0xc01fe5427940 (unreliable) [ 8805.897053][T445431] [c01fe54278f0] [c029b698] truncate_cleanup_page+0x188/0x2e0 [ 8805.897104][T445431] [c01fe5427920] [c029c61c] truncate_inode_pages_range+0x23c/0x9f0 [ 8805.897146][T445431] [c01fe5427b40] [c029ceec] truncate_pagecache+0x5c/0x90 [ 8805.897186][T445431] [c01fe5427b80] [c0569858] xfs_setattr_size+0xb8/0x5d0 [ 8805.897226][T445431] [c01fe5427c10] [c056a26c] xfs_vn_setattr+0x8c/0x130 [ 8805.897266][T445431] [c01fe5427c60] [c03dbe60] notify_change+0x390/0x5b0 [
Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page
On Fri, Sep 11, 2020 at 12:47:03AM +0100, Matthew Wilcox (Oracle) wrote: > Size the uptodate array dynamically to support larger pages in the > page cache. With a 64kB page, we're only saving 8 bytes per page today, > but with a 2MB maximum page size, we'd have to allocate more than 4kB > per page. Add a few debugging assertions. > > Signed-off-by: Matthew Wilcox (Oracle) > Reviewed-by: Dave Chinner Looks ok, Reviewed-by: Darrick J. Wong --D > --- > fs/iomap/buffered-io.c | 22 +- > 1 file changed, 17 insertions(+), 5 deletions(-) > > diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c > index 7fc0e02d27b0..9670c096b83e 100644 > --- a/fs/iomap/buffered-io.c > +++ b/fs/iomap/buffered-io.c > @@ -22,18 +22,25 @@ > #include "../internal.h" > > /* > - * Structure allocated for each page when block size < PAGE_SIZE to track > - * sub-page uptodate status and I/O completions. > + * Structure allocated for each page or THP when block size < page size > + * to track sub-page uptodate status and I/O completions. > */ > struct iomap_page { > atomic_tread_count; > atomic_twrite_count; > spinlock_t uptodate_lock; > - DECLARE_BITMAP(uptodate, PAGE_SIZE / 512); > + unsigned long uptodate[]; > }; > > static inline struct iomap_page *to_iomap_page(struct page *page) > { > + /* > + * per-block data is stored in the head page. Callers should > + * not be dealing with tail pages (and if they are, they can > + * call thp_head() first. > + */ > + VM_BUG_ON_PGFLAGS(PageTail(page), page); > + > if (page_has_private(page)) > return (struct iomap_page *)page_private(page); > return NULL; > @@ -45,11 +52,13 @@ static struct iomap_page * > iomap_page_create(struct inode *inode, struct page *page) > { > struct iomap_page *iop = to_iomap_page(page); > + unsigned int nr_blocks = i_blocks_per_page(inode, page); > > - if (iop || i_blocks_per_page(inode, page) <= 1) > + if (iop || nr_blocks <= 1) > return iop; > > - iop = kzalloc(sizeof(*iop), GFP_NOFS | __GFP_NOFAIL); > + iop = kzalloc(struct_size(iop, uptodate, BITS_TO_LONGS(nr_blocks)), > + GFP_NOFS | __GFP_NOFAIL); > spin_lock_init(>uptodate_lock); > attach_page_private(page, iop); > return iop; > @@ -59,11 +68,14 @@ static void > iomap_page_release(struct page *page) > { > struct iomap_page *iop = detach_page_private(page); > + unsigned int nr_blocks = i_blocks_per_page(page->mapping->host, page); > > if (!iop) > return; > WARN_ON_ONCE(atomic_read(>read_count)); > WARN_ON_ONCE(atomic_read(>write_count)); > + WARN_ON_ONCE(bitmap_full(iop->uptodate, nr_blocks) != > + PageUptodate(page)); > kfree(iop); > } > > -- > 2.28.0 > ___ Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org To unsubscribe send an email to linux-nvdimm-le...@lists.01.org
Re: [PATCH v2 5/9] iomap: Support arbitrarily many blocks per page
On Fri, Sep 11, 2020 at 12:47:03AM +0100, Matthew Wilcox (Oracle) wrote: > Size the uptodate array dynamically to support larger pages in the > page cache. With a 64kB page, we're only saving 8 bytes per page today, > but with a 2MB maximum page size, we'd have to allocate more than 4kB > per page. Add a few debugging assertions. > > Signed-off-by: Matthew Wilcox (Oracle) > Reviewed-by: Dave Chinner Looks good, Reviewed-by: Christoph Hellwig ___ Linux-nvdimm mailing list -- linux-nvdimm@lists.01.org To unsubscribe send an email to linux-nvdimm-le...@lists.01.org