Re: [PATCH v8 04/10] btrfs: fix check_data_csum() error message for direct I/O
On Thu, Mar 18, 2021 at 07:47:29AM +0800, Qu Wenruo wrote: > > > On 2021/3/18 上午2:33, Omar Sandoval wrote: > > On Wed, Mar 17, 2021 at 07:21:46PM +0800, Qu Wenruo wrote: > >> > >> > >> On 2021/3/17 上午3:43, Omar Sandoval wrote: > >>> From: Omar Sandoval > >>> > >>> Commit 1dae796aabf6 ("btrfs: inode: sink parameter start and len to > >>> check_data_csum()") replaced the start parameter to check_data_csum() > >>> with page_offset(), but page_offset() is not meaningful for direct I/O > >>> pages. Bring back the start parameter. > >> > >> So direct IO pages doesn't have page::index set at all? > > > > No, they don't. Usually you do direct I/O into an anonymous page, but I > > suppose you could even do direct I/O into a page mmap'd from another > > file or filesystem. In either case, the index isn't meaningful for the > > file you're doing direct I/O from. > > > >> Any reproducer? I'd like to try to reproduce it first. > > > > The easiest way to see this issue is to apply this patch: > > > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > > index 2a92211439e8..a962b3026573 100644 > > --- a/fs/btrfs/inode.c > > +++ b/fs/btrfs/inode.c > > @@ -3114,6 +3114,9 @@ static int check_data_csum(struct inode *inode, > > struct btrfs_io_bio *io_bio, > > u8 *csum_expected; > > u8 csum[BTRFS_CSUM_SIZE]; > > > > + WARN_ONCE(page_offset(page) + pgoff != start, > > + "page offset %llu != start %llu\n", > > + page_offset(page) + pgoff, start); > > ASSERT(pgoff + len <= PAGE_SIZE); > > > > offset_sectors = bio_offset >> fs_info->sectorsize_bits; > > > > Run this simple test: > > > > $ dd if=/dev/zero of=foo bs=4k count=1024 > > 1024+0 records in > > 1024+0 records out > > 4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.00456495 s, 919 MB/s > > $ sync > > $ dd if=foo of=/dev/null iflag=direct bs=4k > > 1024+0 records in > > 1024+0 records out > > 4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.163079 s, 25.7 MB/s > > > > And you'll get a warning like: > > > > [ 84.896486] [ cut here ] > > [ 84.897370] page offset 94199157981184 != start 0 > > [ 84.898128] WARNING: CPU: 1 PID: 459 at fs/btrfs/inode.c:3119 > > check_data_csum+0x189/0x260 [btrfs] > > [ 84.899547] Modules linked in: btrfs blake2b_generic xor pata_acpi > > ata_piix libata scsi_mod raid6_pq virtio_net net_failover virtio_rng > > libcrc32c rng_core failover > > [ 84.901742] CPU: 1 PID: 459 Comm: kworker/u56:2 Not tainted > > 5.12.0-rc3-00060-ge0cd3910d8cb-dirty #139 > > [ 84.903205] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS > > rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 > > [ 84.904875] Workqueue: btrfs-endio btrfs_work_helper [btrfs] > > [ 84.905749] RIP: 0010:check_data_csum+0x189/0x260 [btrfs] > > [ 84.906576] Code: 57 11 00 00 0f 85 03 ff ff ff 4c 89 ca 48 c7 c7 50 ba > > 35 c0 4c 89 44 24 10 48 89 44 24 08 c6 05 04 57 11 00 01 e8 22 e0 cf d4 > > <0f> 0b 4c 8b 44 24 10 48 8b 44 24 08 e9 d2 fe ff ff 41 8b 45 00 4d > > [ 84.909288] RSP: 0018:b6e9c164bb98 EFLAGS: 00010282 > > [ 84.910061] RAX: RBX: e96b84a05f40 RCX: > > 0001 > > [ 84.911109] RDX: 8001 RSI: 9573d067 RDI: > > > > [ 84.912149] RBP: R08: R09: > > c000dfff > > [ 84.913197] R10: 0001 R11: b6e9c164b9c0 R12: > > > > [ 84.914247] R13: 9d32a28c8dc0 R14: 9d32ac495e10 R15: > > 0004 > > [ 84.915304] FS: () GS:9d399f64() > > knlGS: > > [ 84.916478] CS: 0010 DS: ES: CR0: 80050033 > > [ 84.917340] CR2: 55ad52f97120 CR3: 0001292f4002 CR4: > > 00370ee0 > > [ 84.918435] DR0: DR1: DR2: > > > > [ 84.919473] DR3: DR6: fffe0ff0 DR7: > > 0400 > > [ 84.920515] Call Trace: > > [ 84.920884] ? find_busiest_group+0x41/0x380 > > [ 84.921518] ? load_balance+0x176/0xc10 > > [ 84.922082] ? kvm_sched_clock_read+0x5/0x10 > > [ 84.922711] ? sched_clock+0x5/0x10 > > [ 84.923236] btrfs_end_dio_bio+0x2fb/0x310 [btrfs] > > [ 84.923982] end_workqueue_fn+0x29/0x40 [btrfs] > > [ 84.924698] btrfs_work_helper+0xc1/0x350 [btrfs] > > [ 84.925435] process_one_work+0x1c8/0x390 > > [ 84.926025] ? process_one_work+0x390/0x390 > > [ 84.926650] worker_thread+0x30/0x370 > > [ 84.927209] ? process_one_work+0x390/0x390 > > [ 84.927875] kthread+0x13d/0x160 > > [ 84.928466] ? kthread_park+0x80/0x80 > > [ 84.929008] ret_from_fork+0x22/0x30 > > [ 84.929543] ---[ end trace 4f87c4a13fa476d4 ]--- > > > >>> > >>> Fixes: 265d4ac03fdf ("btrfs: sink parameter start and len to > >>> check_data_csum") > >>> Signed-off-by: Omar Sandoval > >>> --- > >>>fs/btrfs/inode.c | 14 +- > >>>1 file changed, 9 insertions(+), 5 deletions(
Re: [PATCH v8 04/10] btrfs: fix check_data_csum() error message for direct I/O
On 2021/3/18 上午2:33, Omar Sandoval wrote: On Wed, Mar 17, 2021 at 07:21:46PM +0800, Qu Wenruo wrote: On 2021/3/17 上午3:43, Omar Sandoval wrote: From: Omar Sandoval Commit 1dae796aabf6 ("btrfs: inode: sink parameter start and len to check_data_csum()") replaced the start parameter to check_data_csum() with page_offset(), but page_offset() is not meaningful for direct I/O pages. Bring back the start parameter. So direct IO pages doesn't have page::index set at all? No, they don't. Usually you do direct I/O into an anonymous page, but I suppose you could even do direct I/O into a page mmap'd from another file or filesystem. In either case, the index isn't meaningful for the file you're doing direct I/O from. Any reproducer? I'd like to try to reproduce it first. The easiest way to see this issue is to apply this patch: diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 2a92211439e8..a962b3026573 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -3114,6 +3114,9 @@ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio, u8 *csum_expected; u8 csum[BTRFS_CSUM_SIZE]; + WARN_ONCE(page_offset(page) + pgoff != start, + "page offset %llu != start %llu\n", + page_offset(page) + pgoff, start); ASSERT(pgoff + len <= PAGE_SIZE); offset_sectors = bio_offset >> fs_info->sectorsize_bits; Run this simple test: $ dd if=/dev/zero of=foo bs=4k count=1024 1024+0 records in 1024+0 records out 4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.00456495 s, 919 MB/s $ sync $ dd if=foo of=/dev/null iflag=direct bs=4k 1024+0 records in 1024+0 records out 4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.163079 s, 25.7 MB/s And you'll get a warning like: [ 84.896486] [ cut here ] [ 84.897370] page offset 94199157981184 != start 0 [ 84.898128] WARNING: CPU: 1 PID: 459 at fs/btrfs/inode.c:3119 check_data_csum+0x189/0x260 [btrfs] [ 84.899547] Modules linked in: btrfs blake2b_generic xor pata_acpi ata_piix libata scsi_mod raid6_pq virtio_net net_failover virtio_rng libcrc32c rng_core failover [ 84.901742] CPU: 1 PID: 459 Comm: kworker/u56:2 Not tainted 5.12.0-rc3-00060-ge0cd3910d8cb-dirty #139 [ 84.903205] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 84.904875] Workqueue: btrfs-endio btrfs_work_helper [btrfs] [ 84.905749] RIP: 0010:check_data_csum+0x189/0x260 [btrfs] [ 84.906576] Code: 57 11 00 00 0f 85 03 ff ff ff 4c 89 ca 48 c7 c7 50 ba 35 c0 4c 89 44 24 10 48 89 44 24 08 c6 05 04 57 11 00 01 e8 22 e0 cf d4 <0f> 0b 4c 8b 44 24 10 48 8b 44 24 08 e9 d2 fe ff ff 41 8b 45 00 4d [ 84.909288] RSP: 0018:b6e9c164bb98 EFLAGS: 00010282 [ 84.910061] RAX: RBX: e96b84a05f40 RCX: 0001 [ 84.911109] RDX: 8001 RSI: 9573d067 RDI: [ 84.912149] RBP: R08: R09: c000dfff [ 84.913197] R10: 0001 R11: b6e9c164b9c0 R12: [ 84.914247] R13: 9d32a28c8dc0 R14: 9d32ac495e10 R15: 0004 [ 84.915304] FS: () GS:9d399f64() knlGS: [ 84.916478] CS: 0010 DS: ES: CR0: 80050033 [ 84.917340] CR2: 55ad52f97120 CR3: 0001292f4002 CR4: 00370ee0 [ 84.918435] DR0: DR1: DR2: [ 84.919473] DR3: DR6: fffe0ff0 DR7: 0400 [ 84.920515] Call Trace: [ 84.920884] ? find_busiest_group+0x41/0x380 [ 84.921518] ? load_balance+0x176/0xc10 [ 84.922082] ? kvm_sched_clock_read+0x5/0x10 [ 84.922711] ? sched_clock+0x5/0x10 [ 84.923236] btrfs_end_dio_bio+0x2fb/0x310 [btrfs] [ 84.923982] end_workqueue_fn+0x29/0x40 [btrfs] [ 84.924698] btrfs_work_helper+0xc1/0x350 [btrfs] [ 84.925435] process_one_work+0x1c8/0x390 [ 84.926025] ? process_one_work+0x390/0x390 [ 84.926650] worker_thread+0x30/0x370 [ 84.927209] ? process_one_work+0x390/0x390 [ 84.927875] kthread+0x13d/0x160 [ 84.928466] ? kthread_park+0x80/0x80 [ 84.929008] ret_from_fork+0x22/0x30 [ 84.929543] ---[ end trace 4f87c4a13fa476d4 ]--- Fixes: 265d4ac03fdf ("btrfs: sink parameter start and len to check_data_csum") Signed-off-by: Omar Sandoval --- fs/btrfs/inode.c | 14 +- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index ef6cb7b620d0..d2ece8554416 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2947,11 +2947,13 @@ void btrfs_writepage_endio_finish_ordered(struct page *page, u64 start, * @bio_offset: offset to the beginning of the bio (in bytes) * @page:page where is the data to be verified * @pgoff: offset inside the page + * @start: logical offset in the file Please add some comment if only for direct IO we need th
Re: [PATCH v8 04/10] btrfs: fix check_data_csum() error message for direct I/O
On Wed, Mar 17, 2021 at 07:21:46PM +0800, Qu Wenruo wrote: > > > On 2021/3/17 上午3:43, Omar Sandoval wrote: > > From: Omar Sandoval > > > > Commit 1dae796aabf6 ("btrfs: inode: sink parameter start and len to > > check_data_csum()") replaced the start parameter to check_data_csum() > > with page_offset(), but page_offset() is not meaningful for direct I/O > > pages. Bring back the start parameter. > > So direct IO pages doesn't have page::index set at all? No, they don't. Usually you do direct I/O into an anonymous page, but I suppose you could even do direct I/O into a page mmap'd from another file or filesystem. In either case, the index isn't meaningful for the file you're doing direct I/O from. > Any reproducer? I'd like to try to reproduce it first. The easiest way to see this issue is to apply this patch: diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 2a92211439e8..a962b3026573 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -3114,6 +3114,9 @@ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio, u8 *csum_expected; u8 csum[BTRFS_CSUM_SIZE]; + WARN_ONCE(page_offset(page) + pgoff != start, + "page offset %llu != start %llu\n", + page_offset(page) + pgoff, start); ASSERT(pgoff + len <= PAGE_SIZE); offset_sectors = bio_offset >> fs_info->sectorsize_bits; Run this simple test: $ dd if=/dev/zero of=foo bs=4k count=1024 1024+0 records in 1024+0 records out 4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.00456495 s, 919 MB/s $ sync $ dd if=foo of=/dev/null iflag=direct bs=4k 1024+0 records in 1024+0 records out 4194304 bytes (4.2 MB, 4.0 MiB) copied, 0.163079 s, 25.7 MB/s And you'll get a warning like: [ 84.896486] [ cut here ] [ 84.897370] page offset 94199157981184 != start 0 [ 84.898128] WARNING: CPU: 1 PID: 459 at fs/btrfs/inode.c:3119 check_data_csum+0x189/0x260 [btrfs] [ 84.899547] Modules linked in: btrfs blake2b_generic xor pata_acpi ata_piix libata scsi_mod raid6_pq virtio_net net_failover virtio_rng libcrc32c rng_core failover [ 84.901742] CPU: 1 PID: 459 Comm: kworker/u56:2 Not tainted 5.12.0-rc3-00060-ge0cd3910d8cb-dirty #139 [ 84.903205] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 84.904875] Workqueue: btrfs-endio btrfs_work_helper [btrfs] [ 84.905749] RIP: 0010:check_data_csum+0x189/0x260 [btrfs] [ 84.906576] Code: 57 11 00 00 0f 85 03 ff ff ff 4c 89 ca 48 c7 c7 50 ba 35 c0 4c 89 44 24 10 48 89 44 24 08 c6 05 04 57 11 00 01 e8 22 e0 cf d4 <0f> 0b 4c 8b 44 24 10 48 8b 44 24 08 e9 d2 fe ff ff 41 8b 45 00 4d [ 84.909288] RSP: 0018:b6e9c164bb98 EFLAGS: 00010282 [ 84.910061] RAX: RBX: e96b84a05f40 RCX: 0001 [ 84.911109] RDX: 8001 RSI: 9573d067 RDI: [ 84.912149] RBP: R08: R09: c000dfff [ 84.913197] R10: 0001 R11: b6e9c164b9c0 R12: [ 84.914247] R13: 9d32a28c8dc0 R14: 9d32ac495e10 R15: 0004 [ 84.915304] FS: () GS:9d399f64() knlGS: [ 84.916478] CS: 0010 DS: ES: CR0: 80050033 [ 84.917340] CR2: 55ad52f97120 CR3: 0001292f4002 CR4: 00370ee0 [ 84.918435] DR0: DR1: DR2: [ 84.919473] DR3: DR6: fffe0ff0 DR7: 0400 [ 84.920515] Call Trace: [ 84.920884] ? find_busiest_group+0x41/0x380 [ 84.921518] ? load_balance+0x176/0xc10 [ 84.922082] ? kvm_sched_clock_read+0x5/0x10 [ 84.922711] ? sched_clock+0x5/0x10 [ 84.923236] btrfs_end_dio_bio+0x2fb/0x310 [btrfs] [ 84.923982] end_workqueue_fn+0x29/0x40 [btrfs] [ 84.924698] btrfs_work_helper+0xc1/0x350 [btrfs] [ 84.925435] process_one_work+0x1c8/0x390 [ 84.926025] ? process_one_work+0x390/0x390 [ 84.926650] worker_thread+0x30/0x370 [ 84.927209] ? process_one_work+0x390/0x390 [ 84.927875] kthread+0x13d/0x160 [ 84.928466] ? kthread_park+0x80/0x80 [ 84.929008] ret_from_fork+0x22/0x30 [ 84.929543] ---[ end trace 4f87c4a13fa476d4 ]--- > > > > Fixes: 265d4ac03fdf ("btrfs: sink parameter start and len to > > check_data_csum") > > Signed-off-by: Omar Sandoval > > --- > > fs/btrfs/inode.c | 14 +- > > 1 file changed, 9 insertions(+), 5 deletions(-) > > > > diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c > > index ef6cb7b620d0..d2ece8554416 100644 > > --- a/fs/btrfs/inode.c > > +++ b/fs/btrfs/inode.c > > @@ -2947,11 +2947,13 @@ void btrfs_writepage_endio_finish_ordered(struct > > page *page, u64 start, > >* @bio_offset: offset to the beginning of the bio (in bytes) > >* @page: page where is the data to be verified > >* @pgoff:offset inside the page > > + * @start: logical offset in the file > > Please
Re: [PATCH v8 04/10] btrfs: fix check_data_csum() error message for direct I/O
On 2021/3/17 上午3:43, Omar Sandoval wrote: From: Omar Sandoval Commit 1dae796aabf6 ("btrfs: inode: sink parameter start and len to check_data_csum()") replaced the start parameter to check_data_csum() with page_offset(), but page_offset() is not meaningful for direct I/O pages. Bring back the start parameter. So direct IO pages doesn't have page::index set at all? Any reproducer? I'd like to try to reproduce it first. Fixes: 265d4ac03fdf ("btrfs: sink parameter start and len to check_data_csum") Signed-off-by: Omar Sandoval --- fs/btrfs/inode.c | 14 +- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index ef6cb7b620d0..d2ece8554416 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2947,11 +2947,13 @@ void btrfs_writepage_endio_finish_ordered(struct page *page, u64 start, * @bio_offset: offset to the beginning of the bio (in bytes) * @page: page where is the data to be verified * @pgoff:offset inside the page + * @start: logical offset in the file Please add some comment if only for direct IO we need that @start parameter. Thanks, Qu * * The length of such check is always one sector size. */ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio, - u32 bio_offset, struct page *page, u32 pgoff) + u32 bio_offset, struct page *page, u32 pgoff, + u64 start) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); SHASH_DESC_ON_STACK(shash, fs_info->csum_shash); @@ -2978,8 +2980,8 @@ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio, kunmap_atomic(kaddr); return 0; zeroit: - btrfs_print_data_csum_error(BTRFS_I(inode), page_offset(page) + pgoff, - csum, csum_expected, io_bio->mirror_num); + btrfs_print_data_csum_error(BTRFS_I(inode), start, csum, csum_expected, + io_bio->mirror_num); if (io_bio->device) btrfs_dev_stat_inc_and_print(io_bio->device, BTRFS_DEV_STAT_CORRUPTION_ERRS); @@ -3032,7 +3034,8 @@ int btrfs_verify_data_csum(struct btrfs_io_bio *io_bio, u32 bio_offset, pg_off += sectorsize, bio_offset += sectorsize) { int ret; - ret = check_data_csum(inode, io_bio, bio_offset, page, pg_off); + ret = check_data_csum(inode, io_bio, bio_offset, page, pg_off, + page_offset(page) + pg_off); if (ret < 0) return -EIO; } @@ -7742,7 +7745,8 @@ static blk_status_t btrfs_check_read_dio_bio(struct inode *inode, ASSERT(pgoff < PAGE_SIZE); if (uptodate && (!csum || !check_data_csum(inode, io_bio, - bio_offset, bvec.bv_page, pgoff))) { + bio_offset, bvec.bv_page, + pgoff, start))) { clean_io_failure(fs_info, failure_tree, io_tree, start, bvec.bv_page, btrfs_ino(BTRFS_I(inode)),
[PATCH v8 04/10] btrfs: fix check_data_csum() error message for direct I/O
From: Omar Sandoval Commit 1dae796aabf6 ("btrfs: inode: sink parameter start and len to check_data_csum()") replaced the start parameter to check_data_csum() with page_offset(), but page_offset() is not meaningful for direct I/O pages. Bring back the start parameter. Fixes: 265d4ac03fdf ("btrfs: sink parameter start and len to check_data_csum") Signed-off-by: Omar Sandoval --- fs/btrfs/inode.c | 14 +- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index ef6cb7b620d0..d2ece8554416 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -2947,11 +2947,13 @@ void btrfs_writepage_endio_finish_ordered(struct page *page, u64 start, * @bio_offset:offset to the beginning of the bio (in bytes) * @page: page where is the data to be verified * @pgoff: offset inside the page + * @start: logical offset in the file * * The length of such check is always one sector size. */ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio, - u32 bio_offset, struct page *page, u32 pgoff) + u32 bio_offset, struct page *page, u32 pgoff, + u64 start) { struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); SHASH_DESC_ON_STACK(shash, fs_info->csum_shash); @@ -2978,8 +2980,8 @@ static int check_data_csum(struct inode *inode, struct btrfs_io_bio *io_bio, kunmap_atomic(kaddr); return 0; zeroit: - btrfs_print_data_csum_error(BTRFS_I(inode), page_offset(page) + pgoff, - csum, csum_expected, io_bio->mirror_num); + btrfs_print_data_csum_error(BTRFS_I(inode), start, csum, csum_expected, + io_bio->mirror_num); if (io_bio->device) btrfs_dev_stat_inc_and_print(io_bio->device, BTRFS_DEV_STAT_CORRUPTION_ERRS); @@ -3032,7 +3034,8 @@ int btrfs_verify_data_csum(struct btrfs_io_bio *io_bio, u32 bio_offset, pg_off += sectorsize, bio_offset += sectorsize) { int ret; - ret = check_data_csum(inode, io_bio, bio_offset, page, pg_off); + ret = check_data_csum(inode, io_bio, bio_offset, page, pg_off, + page_offset(page) + pg_off); if (ret < 0) return -EIO; } @@ -7742,7 +7745,8 @@ static blk_status_t btrfs_check_read_dio_bio(struct inode *inode, ASSERT(pgoff < PAGE_SIZE); if (uptodate && (!csum || !check_data_csum(inode, io_bio, - bio_offset, bvec.bv_page, pgoff))) { + bio_offset, bvec.bv_page, + pgoff, start))) { clean_io_failure(fs_info, failure_tree, io_tree, start, bvec.bv_page, btrfs_ino(BTRFS_I(inode)), -- 2.30.2