Re: WARN_ON_ONCE(1) in iomap_dio_actor()

2020-08-17 Thread Christoph Hellwig
On Thu, Aug 13, 2020 at 03:52:13AM -0400, Qian Cai wrote:
> > No sane application would ever do this, it's behaviour as expected,
> > so I don't think there's anything to care about here.
> 
> It looks me the kernel warning is trivial to trigger by an non-root user. 
> Shouldn???t we worry a bit because this could be a DoS for systems which set 
> panic_on_warn?

Yes, we should trigger a WARN_ON on user behavior.  a
pr_info_ratelimited + tracepoint and a good comment is probably the
better solution.


Re: WARN_ON_ONCE(1) in iomap_dio_actor()

2020-08-13 Thread Qian Cai



> On Aug 13, 2020, at 1:44 AM, Dave Chinner  wrote:
> 
> Ok:
> 
> file.fd_write = safe_open("./testfile", O_RDWR|O_CREAT);
> 
> file.fd_read = safe_open("./testfile", O_RDWR|O_CREAT|O_DIRECT);
> 
> file.ptr = safe_mmap(NULL, fsize, PROT_READ|PROT_WRITE, MAP_SHARED,
>file.fd_write, 0);
> 
> So this is all IO to the same inode
> 
> and you loop
> 
> while !done {
> 
>do {
>rc = pread(file.fd_read, file.ptr + read, fsize - read,
>read);
>if (rc > 0)
>read += rc;
>} while (rc > 0);
> 
>rc = safe_fallocate(file.fd_write,
>FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
>0, fsize);
> }
> 
> On two threads at once?
> 
> So, essentially, you do a DIO read into a mmap()d range from the
> same file, with DIO read ascending and the mmap() range descending,
> then once that is done you hole punch the file and do it again?
> 
> IOWs, this is a racing page_mkwrite()/DIO read workload, and the
> moment the two threads hit the same block of the file with a
> DIO read and a page_mkwrite at the same time, it throws a warning.
> 
> Well, that's completely expected behaviour. DIO is not serialised
> against mmap() access at all, and so if the page_mkwrite occurs
> between the writeback and the iomap_apply() call in the dio path,
> then it will see the delalloc block taht the page-mkwrite allocated.
> 
> No sane application would ever do this, it's behaviour as expected,
> so I don't think there's anything to care about here.

It looks me the kernel warning is trivial to trigger by an non-root user. 
Shouldn’t we worry a bit because this could be a DoS for systems which set 
panic_on_warn?

Re: WARN_ON_ONCE(1) in iomap_dio_actor()

2020-08-12 Thread Dave Chinner
On Mon, Aug 10, 2020 at 10:03:03PM -0400, Qian Cai wrote:
> On Sun, Jul 26, 2020 at 04:24:12PM +0100, Christoph Hellwig wrote:
> > On Fri, Jul 24, 2020 at 02:24:32PM -0400, Qian Cai wrote:
> > > On Fri, Jun 19, 2020 at 05:17:47PM -0700, Matthew Wilcox wrote:
> > > > On Fri, Jun 19, 2020 at 05:17:50PM -0400, Qian Cai wrote:
> > > > > Running a syscall fuzzer by a normal user could trigger this,
> > > > > 
> > > > > [55649.32][T515839] WARNING: CPU: 6 PID: 515839 at 
> > > > > fs/iomap/direct-io.c:391 iomap_dio_actor+0x29c/0x420
> > > > ...
> > > > > 371 static loff_t
> > > > > 372 iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
> > > > > 373 void *data, struct iomap *iomap, struct iomap 
> > > > > *srcmap)
> > > > > 374 {
> > > > > 375 struct iomap_dio *dio = data;
> > > > > 376
> > > > > 377 switch (iomap->type) {
> > > > > 378 case IOMAP_HOLE:
> > > > > 379 if (WARN_ON_ONCE(dio->flags & IOMAP_DIO_WRITE))
> > > > > 380 return -EIO;
> > > > > 381 return iomap_dio_hole_actor(length, dio);
> > > > > 382 case IOMAP_UNWRITTEN:
> > > > > 383 if (!(dio->flags & IOMAP_DIO_WRITE))
> > > > > 384 return iomap_dio_hole_actor(length, dio);
> > > > > 385 return iomap_dio_bio_actor(inode, pos, length, 
> > > > > dio, iomap);
> > > > > 386 case IOMAP_MAPPED:
> > > > > 387 return iomap_dio_bio_actor(inode, pos, length, 
> > > > > dio, iomap);
> > > > > 388 case IOMAP_INLINE:
> > > > > 389 return iomap_dio_inline_actor(inode, pos, length, 
> > > > > dio, iomap);
> > > > > 390 default:
> > > > > 391 WARN_ON_ONCE(1);
> > > > > 392 return -EIO;
> > > > > 393 }
> > > > > 394 }
> > > > > 
> > > > > Could that be iomap->type == IOMAP_DELALLOC ? Looking throught the 
> > > > > logs,
> > > > > it contains a few pread64() calls until this happens,
> > > > 
> > > > It _shouldn't_ be able to happen.  XFS writes back ranges which exist
> > > > in the page cache upon seeing an O_DIRECT I/O.  So it's not supposed to
> > > > be possible for there to be an extent which is waiting for the contents
> > > > of the page cache to be written back.
> > > 
> > > Okay, it is IOMAP_DELALLOC. We have,
> > 
> > Can you share the fuzzer?  If we end up with delalloc space here we
> > probably need to fix a bug in the cache invalidation code.
> 
> Here is a simple reproducer (I believe it can also be reproduced using 
> xfstests
> generic/503 on a plain xfs without DAX when SCRATCH_MNT == TEST_DIR),
> 
> # git clone https://gitlab.com/cailca/linux-mm
> # cd linux-mm; make
> # ./random 14

Ok:

file.fd_write = safe_open("./testfile", O_RDWR|O_CREAT);

file.fd_read = safe_open("./testfile", O_RDWR|O_CREAT|O_DIRECT);

file.ptr = safe_mmap(NULL, fsize, PROT_READ|PROT_WRITE, MAP_SHARED,
file.fd_write, 0);

So this is all IO to the same inode

and you loop

while !done {

do {
rc = pread(file.fd_read, file.ptr + read, fsize - read,
read);
if (rc > 0)
read += rc;
} while (rc > 0);

rc = safe_fallocate(file.fd_write,
FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
0, fsize);
}

On two threads at once?

So, essentially, you do a DIO read into a mmap()d range from the
same file, with DIO read ascending and the mmap() range descending,
then once that is done you hole punch the file and do it again?

IOWs, this is a racing page_mkwrite()/DIO read workload, and the
moment the two threads hit the same block of the file with a
DIO read and a page_mkwrite at the same time, it throws a warning.

Well, that's completely expected behaviour. DIO is not serialised
against mmap() access at all, and so if the page_mkwrite occurs
between the writeback and the iomap_apply() call in the dio path,
then it will see the delalloc block taht the page-mkwrite allocated.

No sane application would ever do this, it's behaviour as expected,
so I don't think there's anything to care about here.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: WARN_ON_ONCE(1) in iomap_dio_actor()

2020-08-10 Thread Qian Cai
On Sun, Jul 26, 2020 at 04:24:12PM +0100, Christoph Hellwig wrote:
> On Fri, Jul 24, 2020 at 02:24:32PM -0400, Qian Cai wrote:
> > On Fri, Jun 19, 2020 at 05:17:47PM -0700, Matthew Wilcox wrote:
> > > On Fri, Jun 19, 2020 at 05:17:50PM -0400, Qian Cai wrote:
> > > > Running a syscall fuzzer by a normal user could trigger this,
> > > > 
> > > > [55649.32][T515839] WARNING: CPU: 6 PID: 515839 at 
> > > > fs/iomap/direct-io.c:391 iomap_dio_actor+0x29c/0x420
> > > ...
> > > > 371 static loff_t
> > > > 372 iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
> > > > 373 void *data, struct iomap *iomap, struct iomap 
> > > > *srcmap)
> > > > 374 {
> > > > 375 struct iomap_dio *dio = data;
> > > > 376
> > > > 377 switch (iomap->type) {
> > > > 378 case IOMAP_HOLE:
> > > > 379 if (WARN_ON_ONCE(dio->flags & IOMAP_DIO_WRITE))
> > > > 380 return -EIO;
> > > > 381 return iomap_dio_hole_actor(length, dio);
> > > > 382 case IOMAP_UNWRITTEN:
> > > > 383 if (!(dio->flags & IOMAP_DIO_WRITE))
> > > > 384 return iomap_dio_hole_actor(length, dio);
> > > > 385 return iomap_dio_bio_actor(inode, pos, length, dio, 
> > > > iomap);
> > > > 386 case IOMAP_MAPPED:
> > > > 387 return iomap_dio_bio_actor(inode, pos, length, dio, 
> > > > iomap);
> > > > 388 case IOMAP_INLINE:
> > > > 389 return iomap_dio_inline_actor(inode, pos, length, 
> > > > dio, iomap);
> > > > 390 default:
> > > > 391 WARN_ON_ONCE(1);
> > > > 392 return -EIO;
> > > > 393 }
> > > > 394 }
> > > > 
> > > > Could that be iomap->type == IOMAP_DELALLOC ? Looking throught the logs,
> > > > it contains a few pread64() calls until this happens,
> > > 
> > > It _shouldn't_ be able to happen.  XFS writes back ranges which exist
> > > in the page cache upon seeing an O_DIRECT I/O.  So it's not supposed to
> > > be possible for there to be an extent which is waiting for the contents
> > > of the page cache to be written back.
> > 
> > Okay, it is IOMAP_DELALLOC. We have,
> 
> Can you share the fuzzer?  If we end up with delalloc space here we
> probably need to fix a bug in the cache invalidation code.

Here is a simple reproducer (I believe it can also be reproduced using xfstests
generic/503 on a plain xfs without DAX when SCRATCH_MNT == TEST_DIR),

# git clone https://gitlab.com/cailca/linux-mm
# cd linux-mm; make
# ./random 14
- start: mmap_collision
wrote 20480/20480 bytes at offset 0
20 KiB, 5 ops; 0. sec (673.491 MiB/sec and 172413.7931 ops/sec)
wrote 20480/20480 bytes at offset 0
20 KiB, 5 ops; 0. sec (697.545 MiB/sec and 178571.4286 ops/sec)
wrote 20480/20480 bytes at offset 0
20 KiB, 5 ops; 0. sec (723.380 MiB/sec and 185185.1852 ops/sec)
pread: Input/output error

[ 8944.905010][ T6995] [ cut here ]
[ 8944.911193][ T6995] WARNING: CPU: 4 PID: 6995 at fs/iomap/direct-io.c:392 
iomap_dio_actor+0x319/0x480
[ 8944.920498][ T6995] Modules linked in: nls_ascii nls_cp437 vfat fat kvm_amd 
ses enclosure kvm irqbypass efivars acpi_cpufreq efivarfs ip_tables x_tables 
sd_mod smartpqi scsi_transport_sas tg3 mlx5_core libphy firmware_class 
dm_mirror dm_region_hash dm_log dm_mod
[ 8944.943950][ T6995] CPU: 4 PID: 6995 Comm: random Not tainted 
5.8.0-next-20200810+ #2
[ 8944.951855][ T6995] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385 
Gen10, BIOS A40 07/10/2019
[ 8944.961162][ T6995] RIP: 0010:iomap_dio_actor+0x319/0x480
[ 8944.966622][ T6995] Code: 24 f6 43 27 40 0f 84 62 ff ff ff 48 83 c4 20 48 89 
d9 48 89 ea 4c 89 e6 5b 4c 89 ff 5d 41 5c 41 5d 41 5e 41 5f e9 07 ef ff ff <0f> 
0b 48 83 c4 20 48 c7 c0 fb ff ff ff 5b 5d 41 5c 41 5d 41 5e 41
[ 8944.986232][ T6995] RSP: 0018:c90007d1f8d0 EFLAGS: 00010202
[ 8944.992203][ T6995] RAX: 0001 RBX: 88861b681c00 RCX: 
88861b681c00
[ 8945.000105][ T6995] RDX: 192000fa3f33 RSI:  RDI: 
c90007d1f998
[ 8945.008014][ T6995] RBP: 1000 R08: c90007d1f980 R09: 
c90007d1f980
[ 8945.015920][ T6995] R10: a72e72a7 R11: fbfff4e5ce54 R12: 

[ 8945.023805][ T6995] R13: 060e1388 R14: a1f2e080 R15: 
060e1388
[ 8945.031705][ T6995] FS:  7fa50e389700() GS:88905ec0() 
knlGS:
[ 8945.040562][ T6995] CS:  0010 DS:  ES:  CR0: 80050033
[ 8945.047063][ T6995] CR2: 7fa50eb93000 CR3: 0005ad9f2000 CR4: 
003506e0
[ 8945.054952][ T6995] Call Trace:
[ 8945.058147][ T6995]  iomap_apply+0x25a/0xb73
[ 8945.062453][ T6995]  ? iomap_dio_bio_actor+0xde0/0xde0
[ 8945.067650][ T6995]  ? generic_perform_write+0x410/0x410
[ 8945.073003][ T6995]  ? trace_event_raw_event_iomap_apply+0x420/0x420
[ 8945.079424][ T6995]  ? filemap_check_errors+0x51/0xe0
[ 8945.084514][ T6995]  

Re: WARN_ON_ONCE(1) in iomap_dio_actor()

2020-07-26 Thread Qian Cai
On Sun, Jul 26, 2020 at 04:24:12PM +0100, Christoph Hellwig wrote:
> On Fri, Jul 24, 2020 at 02:24:32PM -0400, Qian Cai wrote:
> > On Fri, Jun 19, 2020 at 05:17:47PM -0700, Matthew Wilcox wrote:
> > > On Fri, Jun 19, 2020 at 05:17:50PM -0400, Qian Cai wrote:
> > > > Running a syscall fuzzer by a normal user could trigger this,
> > > > 
> > > > [55649.32][T515839] WARNING: CPU: 6 PID: 515839 at 
> > > > fs/iomap/direct-io.c:391 iomap_dio_actor+0x29c/0x420
> > > ...
> > > > 371 static loff_t
> > > > 372 iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
> > > > 373 void *data, struct iomap *iomap, struct iomap 
> > > > *srcmap)
> > > > 374 {
> > > > 375 struct iomap_dio *dio = data;
> > > > 376
> > > > 377 switch (iomap->type) {
> > > > 378 case IOMAP_HOLE:
> > > > 379 if (WARN_ON_ONCE(dio->flags & IOMAP_DIO_WRITE))
> > > > 380 return -EIO;
> > > > 381 return iomap_dio_hole_actor(length, dio);
> > > > 382 case IOMAP_UNWRITTEN:
> > > > 383 if (!(dio->flags & IOMAP_DIO_WRITE))
> > > > 384 return iomap_dio_hole_actor(length, dio);
> > > > 385 return iomap_dio_bio_actor(inode, pos, length, dio, 
> > > > iomap);
> > > > 386 case IOMAP_MAPPED:
> > > > 387 return iomap_dio_bio_actor(inode, pos, length, dio, 
> > > > iomap);
> > > > 388 case IOMAP_INLINE:
> > > > 389 return iomap_dio_inline_actor(inode, pos, length, 
> > > > dio, iomap);
> > > > 390 default:
> > > > 391 WARN_ON_ONCE(1);
> > > > 392 return -EIO;
> > > > 393 }
> > > > 394 }
> > > > 
> > > > Could that be iomap->type == IOMAP_DELALLOC ? Looking throught the logs,
> > > > it contains a few pread64() calls until this happens,
> > > 
> > > It _shouldn't_ be able to happen.  XFS writes back ranges which exist
> > > in the page cache upon seeing an O_DIRECT I/O.  So it's not supposed to
> > > be possible for there to be an extent which is waiting for the contents
> > > of the page cache to be written back.
> > 
> > Okay, it is IOMAP_DELALLOC. We have,
> 
> Can you share the fuzzer?  If we end up with delalloc space here we
> probably need to fix a bug in the cache invalidation code.

This is just a wrapper of the trinity fuzzer.

# git clone https://gitlab.com/cailca/linux-mm
# cd linux-mm; make
# ./random -x 0-100 -f

https://gitlab.com/cailca/linux-mm/-/blob/master/random.c#L1383


Re: WARN_ON_ONCE(1) in iomap_dio_actor()

2020-07-26 Thread Christoph Hellwig
On Fri, Jul 24, 2020 at 02:24:32PM -0400, Qian Cai wrote:
> On Fri, Jun 19, 2020 at 05:17:47PM -0700, Matthew Wilcox wrote:
> > On Fri, Jun 19, 2020 at 05:17:50PM -0400, Qian Cai wrote:
> > > Running a syscall fuzzer by a normal user could trigger this,
> > > 
> > > [55649.32][T515839] WARNING: CPU: 6 PID: 515839 at 
> > > fs/iomap/direct-io.c:391 iomap_dio_actor+0x29c/0x420
> > ...
> > > 371 static loff_t
> > > 372 iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
> > > 373 void *data, struct iomap *iomap, struct iomap *srcmap)
> > > 374 {
> > > 375 struct iomap_dio *dio = data;
> > > 376
> > > 377 switch (iomap->type) {
> > > 378 case IOMAP_HOLE:
> > > 379 if (WARN_ON_ONCE(dio->flags & IOMAP_DIO_WRITE))
> > > 380 return -EIO;
> > > 381 return iomap_dio_hole_actor(length, dio);
> > > 382 case IOMAP_UNWRITTEN:
> > > 383 if (!(dio->flags & IOMAP_DIO_WRITE))
> > > 384 return iomap_dio_hole_actor(length, dio);
> > > 385 return iomap_dio_bio_actor(inode, pos, length, dio, 
> > > iomap);
> > > 386 case IOMAP_MAPPED:
> > > 387 return iomap_dio_bio_actor(inode, pos, length, dio, 
> > > iomap);
> > > 388 case IOMAP_INLINE:
> > > 389 return iomap_dio_inline_actor(inode, pos, length, 
> > > dio, iomap);
> > > 390 default:
> > > 391 WARN_ON_ONCE(1);
> > > 392 return -EIO;
> > > 393 }
> > > 394 }
> > > 
> > > Could that be iomap->type == IOMAP_DELALLOC ? Looking throught the logs,
> > > it contains a few pread64() calls until this happens,
> > 
> > It _shouldn't_ be able to happen.  XFS writes back ranges which exist
> > in the page cache upon seeing an O_DIRECT I/O.  So it's not supposed to
> > be possible for there to be an extent which is waiting for the contents
> > of the page cache to be written back.
> 
> Okay, it is IOMAP_DELALLOC. We have,

Can you share the fuzzer?  If we end up with delalloc space here we
probably need to fix a bug in the cache invalidation code.


Re: WARN_ON_ONCE(1) in iomap_dio_actor()

2020-07-24 Thread Qian Cai
On Fri, Jun 19, 2020 at 05:17:47PM -0700, Matthew Wilcox wrote:
> On Fri, Jun 19, 2020 at 05:17:50PM -0400, Qian Cai wrote:
> > Running a syscall fuzzer by a normal user could trigger this,
> > 
> > [55649.32][T515839] WARNING: CPU: 6 PID: 515839 at 
> > fs/iomap/direct-io.c:391 iomap_dio_actor+0x29c/0x420
> ...
> > 371 static loff_t
> > 372 iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
> > 373 void *data, struct iomap *iomap, struct iomap *srcmap)
> > 374 {
> > 375 struct iomap_dio *dio = data;
> > 376
> > 377 switch (iomap->type) {
> > 378 case IOMAP_HOLE:
> > 379 if (WARN_ON_ONCE(dio->flags & IOMAP_DIO_WRITE))
> > 380 return -EIO;
> > 381 return iomap_dio_hole_actor(length, dio);
> > 382 case IOMAP_UNWRITTEN:
> > 383 if (!(dio->flags & IOMAP_DIO_WRITE))
> > 384 return iomap_dio_hole_actor(length, dio);
> > 385 return iomap_dio_bio_actor(inode, pos, length, dio, 
> > iomap);
> > 386 case IOMAP_MAPPED:
> > 387 return iomap_dio_bio_actor(inode, pos, length, dio, 
> > iomap);
> > 388 case IOMAP_INLINE:
> > 389 return iomap_dio_inline_actor(inode, pos, length, dio, 
> > iomap);
> > 390 default:
> > 391 WARN_ON_ONCE(1);
> > 392 return -EIO;
> > 393 }
> > 394 }
> > 
> > Could that be iomap->type == IOMAP_DELALLOC ? Looking throught the logs,
> > it contains a few pread64() calls until this happens,
> 
> It _shouldn't_ be able to happen.  XFS writes back ranges which exist
> in the page cache upon seeing an O_DIRECT I/O.  So it's not supposed to
> be possible for there to be an extent which is waiting for the contents
> of the page cache to be written back.

Okay, it is IOMAP_DELALLOC. We have,

[11746.454628][T431855] Page cache invalidation failure on direct I/O.  
Possible data corruption due to collision with buffered I/O!
[11746.466345][T431855] File: /tmp/trinity-testfile2 (deleted) PID: 431855 
Comm: trinity-c54
[11746.474608][T431855] iomap->type = 1, iomap->flags = 2, iomap->length = 
2031616

I noticed the commit 5a9d929d6e13 ("iomap: report collisions between directio
and buffered writes to userspace") started to convert those WARN_ON_ONCE() to
dio_warn_stale_pagecache() indicating that those to be triggered by userspace
programs, so this looks like just another missing place to convert? Apparently,
we don't want non-root users to be able to trigger this warning and kernel
tainted. This?

diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index ec7b78e6feca..7f49292df420 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -388,7 +388,7 @@ iomap_dio_actor(struct inode *inode, loff_t pos, loff_t 
length,
case IOMAP_INLINE:
return iomap_dio_inline_actor(inode, pos, length, dio, iomap);
default:
-   WARN_ON_ONCE(1);
+   dio_warn_stale_pagecache(dio->iocb->ki_filp);
return -EIO;
}
 }

> 
> > [child21:124180] [17] pread64(fd=353, buf=0x0, count=0x59b5, pos=0xe0e0e0e) 
> > = -1 (Illegal seek)
> > [child21:124180] [339] pread64(fd=339, buf=0xbcc4, count=0xbd71, 
> > pos=0xff26) = -1 (Illegal seek)
> > [child21:124627] [136] pread64(fd=69, buf=0xbd29, count=0xee42, 
> > pos=2) = -1 (Illegal seek)
> > [child21:124627] [196] pread64(fd=83, buf=0x1, count=0x62f8, 
> > pos=0x1539) = -1 (Illegal seek)
> > [child21:125127] [154] pread64(fd=345, buf=0xbcc4, count=9332, 
> > pos=0xffbd) = 9332
> > [child21:125169] [188] pread64(fd=69, buf=0xbce9, count=0x4d47, 
> > pos=0) = -1 (Illegal seek)
> > [child21:125169] [227] pread64(fd=345, buf=0x1, count=0xe469, pos=1046) = 
> > -1 (Bad address)
> > [child21:125569] [354] pread64(fd=87, buf=0xbcc5, count=0x4294, 
> > pos=0x16161616) = -1 (Illegal seek)
> > [child21:125569] [655] pread64(fd=341, buf=0xbcc7, count=2210, 
> > pos=0x) = -1 (Illegal seek)
> > [child21:125569] [826] pread64(fd=343, buf=0x8, count=0xeb22, 
> > pos=0xc090c202e598b) = 0
> > [child21:126233] [261] pread64(fd=338, buf=0xbcc4, count=0xe8fe, 
> > pos=105) = -1 (Illegal seek)
> > [child21:126233] [275] pread64(fd=190, buf=0x8, count=0x9c24, pos=116) = -1 
> > (Is a directory)
> > [child21:126882] [32] pread64(fd=86, buf=0xbcc4, count=0x7fc2, 
> > pos=2) = -1 (Illegal seek)
> > [child21:127448] [14] pread64(fd=214, buf=0x4, count=11371, pos=0x9b26) = 0
> > [child21:127489] [70] pread64(fd=339, buf=0xbcc7, count=0xb07a, 
> > pos=8192) = -1 (Illegal seek)
> > [child21:127489] [80] pread64(fd=339, buf=0x0, count=6527, pos=205) = -1 
> > (Illegal seek)
> > [child21:127489] [245] pread64(fd=69, buf=0x8, count=0xbba2, pos=47) = -1 
> > (Illegal seek)
> > [child21:128098] [334] pread64(fd=353, buf=0xbcc9, count=0x4540, 
> > pos=168) = -1 (Illegal seek)
> 

Re: WARN_ON_ONCE(1) in iomap_dio_actor()

2020-06-20 Thread Qian Cai
On Fri, Jun 19, 2020 at 05:17:47PM -0700, Matthew Wilcox wrote:
> On Fri, Jun 19, 2020 at 05:17:50PM -0400, Qian Cai wrote:
> > Running a syscall fuzzer by a normal user could trigger this,
> > 
> > [55649.32][T515839] WARNING: CPU: 6 PID: 515839 at 
> > fs/iomap/direct-io.c:391 iomap_dio_actor+0x29c/0x420
> ...
> > 371 static loff_t
> > 372 iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
> > 373 void *data, struct iomap *iomap, struct iomap *srcmap)
> > 374 {
> > 375 struct iomap_dio *dio = data;
> > 376
> > 377 switch (iomap->type) {
> > 378 case IOMAP_HOLE:
> > 379 if (WARN_ON_ONCE(dio->flags & IOMAP_DIO_WRITE))
> > 380 return -EIO;
> > 381 return iomap_dio_hole_actor(length, dio);
> > 382 case IOMAP_UNWRITTEN:
> > 383 if (!(dio->flags & IOMAP_DIO_WRITE))
> > 384 return iomap_dio_hole_actor(length, dio);
> > 385 return iomap_dio_bio_actor(inode, pos, length, dio, 
> > iomap);
> > 386 case IOMAP_MAPPED:
> > 387 return iomap_dio_bio_actor(inode, pos, length, dio, 
> > iomap);
> > 388 case IOMAP_INLINE:
> > 389 return iomap_dio_inline_actor(inode, pos, length, dio, 
> > iomap);
> > 390 default:
> > 391 WARN_ON_ONCE(1);
> > 392 return -EIO;
> > 393 }
> > 394 }
> > 
> > Could that be iomap->type == IOMAP_DELALLOC ? Looking throught the logs,
> > it contains a few pread64() calls until this happens,
> 
> It _shouldn't_ be able to happen.  XFS writes back ranges which exist
> in the page cache upon seeing an O_DIRECT I/O.  So it's not supposed to
> be possible for there to be an extent which is waiting for the contents
> of the page cache to be written back.

BTW, this is rather easy to reproduce where it happens on x86 as well.

[ 1248.397398] Page cache invalidation failure on direct I/O.  Possible data 
corruption due to collision with buffered I/O!
[ 1248.418261] flags: 0xbfffc10200(slab|head)
[ 1248.473270] File: /tmp/trinity-testfile2 (deleted) PID: 18127 Comm: 
trinity-c33
[ 1248.475128] [ cut here ]
[ 1248.475130] WARNING: CPU: 54 PID: 18127 at fs/iomap/direct-io.c:391 
iomap_dio_actor+0x319/0x480
[ 1248.475131] Modules linked in: nls_ascii nls_cp437 vfat fat kvm_intel kvm 
irqbypass efivars efivarfs ip_tables x_tables sd_mod hpsa scsi_transport_sas 
be2net firmware_class dm_mirror dm_region_hash dm_log dm_mod
[ 1248.475150] CPU: 54 PID: 18127 Comm: trinity-c33 Not tainted 
5.8.0-rc1-next-20200618 #1
[ 1248.475151] Hardware name: HP ProLiant BL460c Gen9, BIOS I36 01/22/2018
[ 1248.475152] RIP: 0010:iomap_dio_actor+0x319/0x480
[ 1248.475154] Code: 24 f6 43 27 40 0f 84 62 ff ff ff 48 83 c4 20 48 89 d9 48 
89 ea 4c 89 e6 5b 4c 89 ff 5d 41 5c 41 5d 41 5e 41 5f e9 f7 ee ff ff <0f> 0b 48 
83 c4 20 48 c7 c0 fb ff ff ff 5b 5d 41 5c 41 5d 41 5e 41
[ 1248.475156] RSP: 0018:c9003041f6f8 EFLAGS: 00010202
[ 1248.475158] RAX: 0001 RBX: 888c5e4bdd00 RCX: 888c5e4bdd00
[ 1248.475159] RDX: 192006083ef9 RSI: 001ff000 RDI: c9003041f7c8
[ 1248.475160] RBP: 1000 R08: c9003041f7b0 R09: c9003041f7b0
[ 1248.475162] R10: 888c678b2e1b R11: ed118cf165c3 R12: 001ff000
[ 1248.475163] R13: 001ff000 R14: 888c4ded4f08 R15: 888c4ded4f08
[ 1248.475164] FS:  7f34c3897740() GS:888c6788() 
knlGS:
[ 1248.475165] CS:  0010 DS:  ES:  CR0: 80050033
[ 1248.475166] CR2: 7f34c13ff000 CR3: 000abd5c0004 CR4: 001606e0
[ 1248.475167] Call Trace:
[ 1248.475168]  iomap_apply+0x258/0xb02
[ 1248.475169]  ? iomap_dio_bio_actor+0xdf0/0xdf0
[ 1248.475170]  ? trace_event_raw_event_iomap_apply+0x420/0x420
[ 1248.475171]  ? generic_write_checks+0x320/0x320
[ 1248.475172]  ? ___ratelimit+0x1c6/0x400
[ 1248.475173]  iomap_dio_rw+0x7ac/0xfe0
[ 1248.475174]  ? iomap_dio_bio_actor+0xdf0/0xdf0
[ 1248.475176]  ? iomap_dio_complete+0x710/0x710
[ 1248.475177]  ? down_read_nested+0x114/0x430
[ 1248.475179]  ? downgrade_write+0x3a0/0x3a0
[ 1248.475181]  ? rcu_read_lock_sched_held+0xaa/0xd0
[ 1248.475183]  ? xfs_file_dio_aio_read+0x1d5/0x4c0
[ 1248.475185]  xfs_file_dio_aio_read+0x1d5/0x4c0
[ 1248.475186]  ? lock_downgrade+0x720/0x720
[ 1248.475188]  xfs_file_read_iter+0x3e3/0x6b0
[ 1248.475190]  do_iter_readv_writev+0x472/0x6e0
[ 1248.475192]  ? default_llseek+0x240/0x240
[ 1248.475194]  ? _copy_from_user+0xbe/0x100
[ 1248.475196]  ? lock_acquire+0x1ac/0xaf0
[ 1248.475197]  ? __fdget_pos+0x9c/0xb0
[ 1248.475199]  do_iter_read+0x1eb/0x5a0
[ 1248.475201]  vfs_readv+0xc7/0x130
[ 1248.475203]  ? __mutex_lock+0x4aa/0x1390
[ 1248.475205]  ? rw_copy_check_uvector+0x380/0x380
[ 1248.475206]  ? __fdget_pos+0x9c/0xb0
[ 1248.475208]  ? find_held_lock+0x33/0x1c0
[ 1248.475210]  ? __task_pid_nr_ns+0x1d1/0x3f0
[ 1248.475212]  

Re: WARN_ON_ONCE(1) in iomap_dio_actor()

2020-06-19 Thread Matthew Wilcox
On Fri, Jun 19, 2020 at 05:17:50PM -0400, Qian Cai wrote:
> Running a syscall fuzzer by a normal user could trigger this,
> 
> [55649.32][T515839] WARNING: CPU: 6 PID: 515839 at 
> fs/iomap/direct-io.c:391 iomap_dio_actor+0x29c/0x420
...
> 371 static loff_t
> 372 iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
> 373 void *data, struct iomap *iomap, struct iomap *srcmap)
> 374 {
> 375 struct iomap_dio *dio = data;
> 376
> 377 switch (iomap->type) {
> 378 case IOMAP_HOLE:
> 379 if (WARN_ON_ONCE(dio->flags & IOMAP_DIO_WRITE))
> 380 return -EIO;
> 381 return iomap_dio_hole_actor(length, dio);
> 382 case IOMAP_UNWRITTEN:
> 383 if (!(dio->flags & IOMAP_DIO_WRITE))
> 384 return iomap_dio_hole_actor(length, dio);
> 385 return iomap_dio_bio_actor(inode, pos, length, dio, 
> iomap);
> 386 case IOMAP_MAPPED:
> 387 return iomap_dio_bio_actor(inode, pos, length, dio, 
> iomap);
> 388 case IOMAP_INLINE:
> 389 return iomap_dio_inline_actor(inode, pos, length, dio, 
> iomap);
> 390 default:
> 391 WARN_ON_ONCE(1);
> 392 return -EIO;
> 393 }
> 394 }
> 
> Could that be iomap->type == IOMAP_DELALLOC ? Looking throught the logs,
> it contains a few pread64() calls until this happens,

It _shouldn't_ be able to happen.  XFS writes back ranges which exist
in the page cache upon seeing an O_DIRECT I/O.  So it's not supposed to
be possible for there to be an extent which is waiting for the contents
of the page cache to be written back.

> [child21:124180] [17] pread64(fd=353, buf=0x0, count=0x59b5, pos=0xe0e0e0e) = 
> -1 (Illegal seek)
> [child21:124180] [339] pread64(fd=339, buf=0xbcc4, count=0xbd71, 
> pos=0xff26) = -1 (Illegal seek)
> [child21:124627] [136] pread64(fd=69, buf=0xbd29, count=0xee42, 
> pos=2) = -1 (Illegal seek)
> [child21:124627] [196] pread64(fd=83, buf=0x1, count=0x62f8, pos=0x1539) 
> = -1 (Illegal seek)
> [child21:125127] [154] pread64(fd=345, buf=0xbcc4, count=9332, 
> pos=0xffbd) = 9332
> [child21:125169] [188] pread64(fd=69, buf=0xbce9, count=0x4d47, 
> pos=0) = -1 (Illegal seek)
> [child21:125169] [227] pread64(fd=345, buf=0x1, count=0xe469, pos=1046) = -1 
> (Bad address)
> [child21:125569] [354] pread64(fd=87, buf=0xbcc5, count=0x4294, 
> pos=0x16161616) = -1 (Illegal seek)
> [child21:125569] [655] pread64(fd=341, buf=0xbcc7, count=2210, 
> pos=0x) = -1 (Illegal seek)
> [child21:125569] [826] pread64(fd=343, buf=0x8, count=0xeb22, 
> pos=0xc090c202e598b) = 0
> [child21:126233] [261] pread64(fd=338, buf=0xbcc4, count=0xe8fe, 
> pos=105) = -1 (Illegal seek)
> [child21:126233] [275] pread64(fd=190, buf=0x8, count=0x9c24, pos=116) = -1 
> (Is a directory)
> [child21:126882] [32] pread64(fd=86, buf=0xbcc4, count=0x7fc2, pos=2) 
> = -1 (Illegal seek)
> [child21:127448] [14] pread64(fd=214, buf=0x4, count=11371, pos=0x9b26) = 0
> [child21:127489] [70] pread64(fd=339, buf=0xbcc7, count=0xb07a, 
> pos=8192) = -1 (Illegal seek)
> [child21:127489] [80] pread64(fd=339, buf=0x0, count=6527, pos=205) = -1 
> (Illegal seek)
> [child21:127489] [245] pread64(fd=69, buf=0x8, count=0xbba2, pos=47) = -1 
> (Illegal seek)
> [child21:128098] [334] pread64(fd=353, buf=0xbcc9, count=0x4540, 
> pos=168) = -1 (Illegal seek)
> [child21:129079] [157] pread64(fd=422, buf=0x0, count=0x80df, 
> pos=0xdfef6378b650aa) = 0
> [child21:134700] [275] pread64(fd=397, buf=0xbcc5, count=0xdee6, 
> pos=0x887b1e74a2) = -1 (Illegal seek)
> [child21:135042] [7] pread64(fd=80, buf=0x8, count=0xc494, pos=216) = -1 
> (Illegal seek)
> [child21:135056] [188] pread64(fd=430, buf=0xbd09, count=0xbe66, 
> pos=0x3a3a3a3a) = -1 (Illegal seek)
> [child21:135442] [143] pread64(fd=226, buf=0xbd39, count=11558, 
> pos=0x102d) = 0
> [child21:135513] [275] pread64(fd=69, buf=0x4, count=4659, 
> pos=0x486005206c2986) = -1 (Illegal seek)
> [child21:135513] [335] pread64(fd=339, buf=0xbd09, count=0x90fd, 
> pos=253) = -1 (Illegal seek)
> [child21:135513] [392] pread64(fd=76, buf=0xbcc4, count=0xf324, 
> pos=0x5d5d5d5d) = -1 (Illegal seek)
> [child21:135665] [5] pread64(fd=431, buf=0xbcc7, count=10545, 
> pos=16384) = -1 (Illegal seek)
> [child21:135665] [293] pread64(fd=349, buf=0x4, count=0xd2ad, pos=0x200) 
> = -1 (Illegal seek)
> [child21:135790] [99] pread64(fd=76, buf=0x8, count=0x70d7, pos=0x21000440) = 
> -1 (Illegal seek)
> [child21:135790] [149] pread64(fd=70, buf=0xbd5b, count=0x53f3, 
> pos=255) = -1 (Illegal seek)
> [child21:135790] [301] pread64(fd=348, buf=0x4, count=5713, pos=0x6c00401a) = 
> -1 (Illegal seek)
> [child21:136162] [570] pread64(fd=435, buf=0x1, count=11182, pos=248) = -1 
> (Illegal seek)
> [child21:136162] [584] 

WARN_ON_ONCE(1) in iomap_dio_actor()

2020-06-19 Thread Qian Cai
Running a syscall fuzzer by a normal user could trigger this,

[55649.32][T515839] WARNING: CPU: 6 PID: 515839 at fs/iomap/direct-io.c:391 
iomap_dio_actor+0x29c/0x420
[55649.339490][T515839] Modules linked in: vfio_pci vfio_virqfd 
vfio_iommu_type1 vfio processor efivarfs ip_tables x_tables sd_mod mlx5_core 
ahci firmware_class libahci libata dm_mirror dm_region_hash dm_log dm_mod
[55649.358315][T515839] CPU: 6 PID: 515839 Comm: trinity-c21 Not tainted 
5.8.0-rc1-next-20200618 #1
[55649.367100][T515839] Hardware name: HPE Apollo 70 /C01_APACHE_MB 
, BIOS L50_5.13_1.11 06/18/2019
[55649.377624][T515839] pstate: 8049 (Nzcv daif +PAN -UAO BTYPE=--)
[55649.383987][T515839] pc : iomap_dio_actor+0x29c/0x420
[55649.389040][T515839] lr : iomap_apply+0x31c/0x14e8
[55649.393830][T515839] sp : 00953540f670
[55649.397925][T515839] x29: 00953540f670 x28: 
[55649.404022][T515839] x27: 00953540f7b0 x26: a00012113000
[55649.410119][T515839] x25: d904 x24: a00011553400
[55649.416216][T515839] x23: 0095c49d1288 x22: 0095c49d1288
[55649.422313][T515839] x21: 0080 x20: d904
[55649.428416][T515839] x19: 0088f9ad9b80 x18: 
[55649.434516][T515839] x17:  x16: 
[55649.440613][T515839] x15:  x14: 0022
[55649.446709][T515839] x13: 80113ef86764 x12: 1fffe0113ef86763
[55649.452808][T515839] x11: 1fffe0113ef86763 x10: 80113ef86763
[55649.458913][T515839] x9 : dfffa000 x8 : 0089f7c33b1b
[55649.465034][T515839] x7 : 0001 x6 : dfffa000
[55649.471144][T515839] x5 : 00953540f7c8 x4 : 00953540f7b0
[55649.477248][T515839] x3 : 0088f9ad9b80 x2 : 0001
[55649.483346][T515839] x1 : 0001 x0 : 0095c49d1288
[55649.489452][T515839] Call trace:
[55649.492700][T515839]  iomap_dio_actor+0x29c/0x420
[55649.497411][T515839]  iomap_apply+0x31c/0x14e8
iomap_apply at fs/iomap/apply.c:80 (discriminator 4)
[55649.501860][T515839]  iomap_dio_rw+0x600/0xb60
iomap_dio_rw at fs/iomap/direct-io.c:503
[55649.506312][T515839]  xfs_file_dio_aio_read+0x204/0x7a0
xfs_file_dio_aio_read at fs/xfs/xfs_file.c:186
[55649.511543][T515839]  xfs_file_read_iter+0x3f0/0x628
xfs_file_read_iter at fs/xfs/xfs_file.c:260
[55649.516514][T515839]  new_sync_read+0x300/0x4f0
call_read_iter at include/linux/fs.h:1920 (discriminator 1)
(inlined by) new_sync_read at fs/read_write.c:415 (discriminator 1)
[55649.521051][T515839]  __vfs_read+0x88/0xe8
__vfs_read at fs/read_write.c:431
[55649.525152][T515839]  vfs_read+0xd8/0x228
[55649.529167][T515839]  ksys_pread64+0x110/0x158
[55649.533614][T515839]  __arm64_sys_pread64+0x84/0xc0
[55649.538498][T515839]  do_el0_svc+0x124/0x220
[55649.542771][T515839]  el0_sync_handler+0x260/0x408
[55649.547565][T515839]  el0_sync+0x140/0x180
[55649.551662][T515839] irq event stamp: 257496
[55649.555941][T515839] hardirqs last  enabled at (257495): 
[] free_unref_page_list+0x5ec/0x940
[55649.565951][T515839] hardirqs last disabled at (257496): 
[] do_debug_exception+0x304/0x524
[55649.575783][T515839] softirqs last  enabled at (256808): 
[] efi_header_end+0xb38/0x1204
[55649.585355][T515839] softirqs last disabled at (256801): 
[] irq_exit+0x2dc/0x3d0

371 static loff_t
372 iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length,
373 void *data, struct iomap *iomap, struct iomap *srcmap)
374 {
375 struct iomap_dio *dio = data;
376
377 switch (iomap->type) {
378 case IOMAP_HOLE:
379 if (WARN_ON_ONCE(dio->flags & IOMAP_DIO_WRITE))
380 return -EIO;
381 return iomap_dio_hole_actor(length, dio);
382 case IOMAP_UNWRITTEN:
383 if (!(dio->flags & IOMAP_DIO_WRITE))
384 return iomap_dio_hole_actor(length, dio);
385 return iomap_dio_bio_actor(inode, pos, length, dio, iomap);
386 case IOMAP_MAPPED:
387 return iomap_dio_bio_actor(inode, pos, length, dio, iomap);
388 case IOMAP_INLINE:
389 return iomap_dio_inline_actor(inode, pos, length, dio, 
iomap);
390 default:
391 WARN_ON_ONCE(1);
392 return -EIO;
393 }
394 }

Could that be iomap->type == IOMAP_DELALLOC ? Looking throught the logs,
it contains a few pread64() calls until this happens,

[child21:124180] [17] pread64(fd=353, buf=0x0, count=0x59b5, pos=0xe0e0e0e) = 
-1 (Illegal seek)
[child21:124180] [339] pread64(fd=339, buf=0xbcc4, count=0xbd71, 
pos=0xff26) = -1 (Illegal seek)
[child21:124627] [136] pread64(fd=69, buf=0xbd29, count=0xee42, pos=2) 
= -1 (Illegal seek)
[child21:124627] [196] pread64(fd=83, buf=0x1, count=0x62f8, pos=0x1539) = 
-1 (Illegal seek)
[child21:125127] [154] pread64(fd=345, buf=0xbcc4, count=9332, 
pos=0xffbd) = 9332
[child21:125169] [188]