Re: [syzbot] KCSAN: data-race in start_this_handle / start_this_handle

2021-03-19 Thread Paul E. McKenney
On Fri, Mar 19, 2021 at 11:15:42PM +0900, Tetsuo Handa wrote:
> On 2021/03/12 0:54, Marco Elver wrote:
> >> But the more we could have the compiler automatically figure out
> >> things without needing an explicit tag, it would seem to me that this
> >> would be better, since manual tagging is going to be more error-prone.
> > 
> > What you're alluding to here would go much further than a data race
> > detector ("data race" is still just defined by the memory model). The
> > wish that there was a static analysis tool that would automatically
> > understand the "concurrency semantics as intended by the developer" is
> > something that'd be nice to have, but just doesn't seem realistic.
> > Because how can a tool tell what the developer intended, without input
> > from that developer?
> 
> Input from developers is very important for not only compilers and tools
> but also allowing bug-explorers to understand what is happening.
> ext4 currently has
> 
>   possible deadlock in start_this_handle (2)
>   
> https://syzkaller.appspot.com/bug?id=38c060d5757cbc13fdffd46e80557c645fbe79ba
> 
> which even maintainers cannot understand what is happening.
> How can bug-explorers know implicit logic which maintainers believe safe and 
> correct?
> It is possible that some oversight in implicit logic is the cause of
> "possible deadlock in start_this_handle (2)".
> Making implicit assumptions clear helps understanding.

Just to be clear, the above diagnostic is from lockdep rather than KCSAN.

According to the sample crash result, different code paths acquire
jdb2_handle and the __fs_reclaim_map in different orders.  It looks
to me that __fs_reclaim_map isn't really a lock, but rather a mode
indicator.  If so, lockdep should set it up accordingly, perhaps
in a manner similar to rcu_lock_map.

> Will "KCSAN: data-race in start_this_handle / start_this_handle" be addressed 
> by marking?
> syzbot is already waiting for
> "KCSAN: data-race in jbd2_journal_dirty_metadata / 
> jbd2_journal_dirty_metadata" at
> https://syzkaller.appspot.com/bug?id=5eb10023f53097f003e72c6a7c1a6f14b7c22929 
> .

The first thing is to work out what the code should be doing.  What KCSAN
is saying is that a variable is being locklessly updated.  Is it really
OK for that variable to be locklessly updated?  If not, a larger fix
is required.

For more information, please see Marco's LWN series:
https://lwn.net/Articles/816850/ and https://lwn.net/Articles/816854/

Alternatively, you can refer to the documentation being proposed for
the Linux kernel tree:

https://lore.kernel.org/lkml/20210304004543.25364-3-paul...@kernel.org/

> > If there's worry marking accesses is error-prone, then that might be a
> > signal that the concurrency design is too complex (or the developer
> > hasn't considered all cases).
> > 
> > For that reason, we need to mark accesses to tell the compiler and
> > tooling where to expect concurrency, so that 1) the compiler generates
> > correct code, and 2) tooling such as KCSAN can double-check what the
> > developer intended is actually what's happening.
> 
> and 3) bug-explorers can understand what the developers are assuming/missing.

If the above information doesn't help the bug explorers, please let me
know.

Thanx, Paul


Re: [syzbot] KCSAN: data-race in start_this_handle / start_this_handle

2021-03-19 Thread Tetsuo Handa
On 2021/03/12 0:54, Marco Elver wrote:
>> But the more we could have the compiler automatically figure out
>> things without needing an explicit tag, it would seem to me that this
>> would be better, since manual tagging is going to be more error-prone.
> 
> What you're alluding to here would go much further than a data race
> detector ("data race" is still just defined by the memory model). The
> wish that there was a static analysis tool that would automatically
> understand the "concurrency semantics as intended by the developer" is
> something that'd be nice to have, but just doesn't seem realistic.
> Because how can a tool tell what the developer intended, without input
> from that developer?

Input from developers is very important for not only compilers and tools
but also allowing bug-explorers to understand what is happening.
ext4 currently has

  possible deadlock in start_this_handle (2)
  https://syzkaller.appspot.com/bug?id=38c060d5757cbc13fdffd46e80557c645fbe79ba

which even maintainers cannot understand what is happening.
How can bug-explorers know implicit logic which maintainers believe safe and 
correct?
It is possible that some oversight in implicit logic is the cause of
"possible deadlock in start_this_handle (2)".
Making implicit assumptions clear helps understanding.

Will "KCSAN: data-race in start_this_handle / start_this_handle" be addressed 
by marking?
syzbot is already waiting for
"KCSAN: data-race in jbd2_journal_dirty_metadata / jbd2_journal_dirty_metadata" 
at
https://syzkaller.appspot.com/bug?id=5eb10023f53097f003e72c6a7c1a6f14b7c22929 .

> 
> If there's worry marking accesses is error-prone, then that might be a
> signal that the concurrency design is too complex (or the developer
> hasn't considered all cases).
> 
> For that reason, we need to mark accesses to tell the compiler and
> tooling where to expect concurrency, so that 1) the compiler generates
> correct code, and 2) tooling such as KCSAN can double-check what the
> developer intended is actually what's happening.

and 3) bug-explorers can understand what the developers are assuming/missing.



Re: [syzbot] KCSAN: data-race in start_this_handle / start_this_handle

2021-03-11 Thread Marco Elver
+Cc Paul

On Thu, 11 Mar 2021 at 16:30, Theodore Ts'o  wrote:
>
> On Thu, Mar 11, 2021 at 04:08:30PM +0100, Marco Elver wrote:
> > If the outcome of the check does not affect correctness and the code is
> > entirely fault tolerant to the precise value being read, then a
> > data_race(!journal->j_running_transaction) marking here would be fine.
>
> So a very common coding pattern is to check a value w/o the lock, and
> if it looks like we might need to check *with* a lock, we'll grab the
> lock and recheck.  Does KCSAN understand that this sort of thing is
> safe automatically?

It doesn't -- these are concurrency patterns that go way beyond the
scope of a data race detector. The main problem, as always with such
patterns, is that in one case it seems fine, yet in the next it isn't.

> In thie particular case, it's a bit more complicated than that; we're
> checking a value, and then allocating memory, grabbing the spin lock,
> and then re-checking the value, so we don't have to drop the spinlock,
> allocate the memory, grab the lock again, and then rechecking the
> value.  So even if KCSAN catches the simpler case as described above,
> we still might need to explicitly mark the data_race explicitly.

This seems like a variation of double-checked locking.

> But the more we could have the compiler automatically figure out
> things without needing an explicit tag, it would seem to me that this
> would be better, since manual tagging is going to be more error-prone.

What you're alluding to here would go much further than a data race
detector ("data race" is still just defined by the memory model). The
wish that there was a static analysis tool that would automatically
understand the "concurrency semantics as intended by the developer" is
something that'd be nice to have, but just doesn't seem realistic.
Because how can a tool tell what the developer intended, without input
from that developer?

If there's worry marking accesses is error-prone, then that might be a
signal that the concurrency design is too complex (or the developer
hasn't considered all cases).

For that reason, we need to mark accesses to tell the compiler and
tooling where to expect concurrency, so that 1) the compiler generates
correct code, and 2) tooling such as KCSAN can double-check what the
developer intended is actually what's happening.

Thanks,
-- Marco


Re: [syzbot] KCSAN: data-race in start_this_handle / start_this_handle

2021-03-11 Thread Theodore Ts'o
On Thu, Mar 11, 2021 at 04:08:30PM +0100, Marco Elver wrote:
> If the outcome of the check does not affect correctness and the code is
> entirely fault tolerant to the precise value being read, then a
> data_race(!journal->j_running_transaction) marking here would be fine.

So a very common coding pattern is to check a value w/o the lock, and
if it looks like we might need to check *with* a lock, we'll grab the
lock and recheck.  Does KCSAN understand that this sort of thing is
safe automatically?

In thie particular case, it's a bit more complicated than that; we're
checking a value, and then allocating memory, grabbing the spin lock,
and then re-checking the value, so we don't have to drop the spinlock,
allocate the memory, grab the lock again, and then rechecking the
value.  So even if KCSAN catches the simpler case as described above,
we still might need to explicitly mark the data_race explicitly.

But the more we could have the compiler automatically figure out
things without needing an explicit tag, it would seem to me that this
would be better, since manual tagging is going to be more error-prone.

Cheers,

   - Ted


Re: [syzbot] KCSAN: data-race in start_this_handle / start_this_handle

2021-03-11 Thread Marco Elver
On Thu, Mar 11, 2021 at 03:53PM +0100, Dmitry Vyukov wrote:
> On Thu, Mar 11, 2021 at 3:25 PM Jan Kara  wrote:
> >
> > On Thu 11-03-21 02:59:14, syzbot wrote:
> > > HEAD commit:a74e6a01 Merge tag 's390-5.12-3' of 
> > > git://git.kernel.org/p..
> > > git tree:   upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=159f69ecd0
> > > kernel config:  https://syzkaller.appspot.com/x/.config?x=de394bbaade74fb7
> > > dashboard link: 
> > > https://syzkaller.appspot.com/bug?extid=30774a6acf6a2cf6d535
> > > compiler:   Debian clang version 11.0.1-2
> > >
> > > Unfortunately, I don't have any reproducer for this issue yet.
> > >
> > > IMPORTANT: if you fix the issue, please add the following tag to the 
> > > commit:
> > > Reported-by: syzbot+30774a6acf6a2cf6d...@syzkaller.appspotmail.com
> > >
> > > ==
> > > BUG: KCSAN: data-race in start_this_handle / start_this_handle
> > >
> > > write to 0x888103880870 of 8 bytes by task 29956 on cpu 1:
> > >  jbd2_get_transaction fs/jbd2/transaction.c:125 [inline]
> > >  start_this_handle+0xceb/0x1010 fs/jbd2/transaction.c:400
> > >  jbd2__journal_start+0x1fc/0x3f0 fs/jbd2/transaction.c:503
> > >  __ext4_journal_start_sb+0x159/0x310 fs/ext4/ext4_jbd2.c:105
> > >  __ext4_journal_start fs/ext4/ext4_jbd2.h:320 [inline]
> > >  ext4_da_write_begin+0x460/0xaf0 fs/ext4/inode.c:2998
> > >  generic_perform_write+0x196/0x3a0 mm/filemap.c:3575
> > >  ext4_buffered_write_iter+0x2e5/0x3e0 fs/ext4/file.c:269
> > >  ext4_file_write_iter+0x48a/0x10b0 fs/ext4/file.c:502
> > >  call_write_iter include/linux/fs.h:1977 [inline]
> > >  do_iter_readv_writev+0x2cb/0x360 fs/read_write.c:740
> > >  do_iter_write+0x112/0x4c0 fs/read_write.c:866
> > >  vfs_iter_write+0x4c/0x70 fs/read_write.c:907
> > >  iter_file_splice_write+0x40a/0x750 fs/splice.c:689
> > >  do_splice_from fs/splice.c:767 [inline]
> > >  direct_splice_actor+0x80/0xa0 fs/splice.c:936
> > >  splice_direct_to_actor+0x345/0x650 fs/splice.c:891
> > >  do_splice_direct+0xf5/0x170 fs/splice.c:979
> > >  do_sendfile+0x7a6/0xe20 fs/read_write.c:1260
> > >  __do_sys_sendfile64 fs/read_write.c:1319 [inline]
> > >  __se_sys_sendfile64 fs/read_write.c:1311 [inline]
> > >  __x64_sys_sendfile64+0xa9/0x130 fs/read_write.c:1311
> > >  do_syscall_64+0x39/0x80 arch/x86/entry/common.c:46
> > >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > >
> > > read to 0x888103880870 of 8 bytes by task 29936 on cpu 0:
> > >  start_this_handle+0x1c1/0x1010 fs/jbd2/transaction.c:352
> > >  jbd2__journal_start+0x1fc/0x3f0 fs/jbd2/transaction.c:503
> > >  __ext4_journal_start_sb+0x159/0x310 fs/ext4/ext4_jbd2.c:105
> > >  __ext4_journal_start fs/ext4/ext4_jbd2.h:320 [inline]
> > >  ext4_da_write_begin+0x460/0xaf0 fs/ext4/inode.c:2998
> > >  generic_perform_write+0x196/0x3a0 mm/filemap.c:3575
> > >  ext4_buffered_write_iter+0x2e5/0x3e0 fs/ext4/file.c:269
> > >  ext4_file_write_iter+0x48a/0x10b0 fs/ext4/file.c:502
> > >  call_write_iter include/linux/fs.h:1977 [inline]
> > >  do_iter_readv_writev+0x2cb/0x360 fs/read_write.c:740
> > >  do_iter_write+0x112/0x4c0 fs/read_write.c:866
> > >  vfs_iter_write+0x4c/0x70 fs/read_write.c:907
> > >  iter_file_splice_write+0x40a/0x750 fs/splice.c:689
> > >  do_splice_from fs/splice.c:767 [inline]
> > >  direct_splice_actor+0x80/0xa0 fs/splice.c:936
> > >  splice_direct_to_actor+0x345/0x650 fs/splice.c:891
> > >  do_splice_direct+0xf5/0x170 fs/splice.c:979
> > >  do_sendfile+0x7a6/0xe20 fs/read_write.c:1260
> > >  __do_sys_sendfile64 fs/read_write.c:1319 [inline]
> > >  __se_sys_sendfile64 fs/read_write.c:1311 [inline]
> > >  __x64_sys_sendfile64+0xa9/0x130 fs/read_write.c:1311
> > >  do_syscall_64+0x39/0x80 arch/x86/entry/common.c:46
> > >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> > >
> > > Reported by Kernel Concurrency Sanitizer on:
> > > CPU: 0 PID: 29936 Comm: syz-executor.5 Not tainted 5.12.0-rc2-syzkaller #0
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> > > Google 01/01/2011
> > > ==
> >
> > So this case is harmless. start_this_handle() does indeed check
> > journal->j_running_transaction without any protection and this is only a
> > racy check to opportunistically preallocate a transaction if we are likely
> > to need it. There was some macro to instruct KCSAN that the read is
> > actually fine, wasn't there?

If the outcome of the check does not affect correctness and the code is
entirely fault tolerant to the precise value being read, then a
data_race(!journal->j_running_transaction) marking here would be fine.

If in doubt, this is the latest summary for concurrent access markings:
https://lkml.kernel.org/r/20210220051000.GA457@paulmck-ThinkPad-P72

> +Marco
> +Tetsuo, did you want to fix it?

Thanks,
-- Marco


Re: [syzbot] KCSAN: data-race in start_this_handle / start_this_handle

2021-03-11 Thread Dmitry Vyukov
On Thu, Mar 11, 2021 at 3:25 PM Jan Kara  wrote:
>
> On Thu 11-03-21 02:59:14, syzbot wrote:
> > HEAD commit:a74e6a01 Merge tag 's390-5.12-3' of git://git.kernel.org/p..
> > git tree:   upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=159f69ecd0
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=de394bbaade74fb7
> > dashboard link: https://syzkaller.appspot.com/bug?extid=30774a6acf6a2cf6d535
> > compiler:   Debian clang version 11.0.1-2
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: syzbot+30774a6acf6a2cf6d...@syzkaller.appspotmail.com
> >
> > ==
> > BUG: KCSAN: data-race in start_this_handle / start_this_handle
> >
> > write to 0x888103880870 of 8 bytes by task 29956 on cpu 1:
> >  jbd2_get_transaction fs/jbd2/transaction.c:125 [inline]
> >  start_this_handle+0xceb/0x1010 fs/jbd2/transaction.c:400
> >  jbd2__journal_start+0x1fc/0x3f0 fs/jbd2/transaction.c:503
> >  __ext4_journal_start_sb+0x159/0x310 fs/ext4/ext4_jbd2.c:105
> >  __ext4_journal_start fs/ext4/ext4_jbd2.h:320 [inline]
> >  ext4_da_write_begin+0x460/0xaf0 fs/ext4/inode.c:2998
> >  generic_perform_write+0x196/0x3a0 mm/filemap.c:3575
> >  ext4_buffered_write_iter+0x2e5/0x3e0 fs/ext4/file.c:269
> >  ext4_file_write_iter+0x48a/0x10b0 fs/ext4/file.c:502
> >  call_write_iter include/linux/fs.h:1977 [inline]
> >  do_iter_readv_writev+0x2cb/0x360 fs/read_write.c:740
> >  do_iter_write+0x112/0x4c0 fs/read_write.c:866
> >  vfs_iter_write+0x4c/0x70 fs/read_write.c:907
> >  iter_file_splice_write+0x40a/0x750 fs/splice.c:689
> >  do_splice_from fs/splice.c:767 [inline]
> >  direct_splice_actor+0x80/0xa0 fs/splice.c:936
> >  splice_direct_to_actor+0x345/0x650 fs/splice.c:891
> >  do_splice_direct+0xf5/0x170 fs/splice.c:979
> >  do_sendfile+0x7a6/0xe20 fs/read_write.c:1260
> >  __do_sys_sendfile64 fs/read_write.c:1319 [inline]
> >  __se_sys_sendfile64 fs/read_write.c:1311 [inline]
> >  __x64_sys_sendfile64+0xa9/0x130 fs/read_write.c:1311
> >  do_syscall_64+0x39/0x80 arch/x86/entry/common.c:46
> >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > read to 0x888103880870 of 8 bytes by task 29936 on cpu 0:
> >  start_this_handle+0x1c1/0x1010 fs/jbd2/transaction.c:352
> >  jbd2__journal_start+0x1fc/0x3f0 fs/jbd2/transaction.c:503
> >  __ext4_journal_start_sb+0x159/0x310 fs/ext4/ext4_jbd2.c:105
> >  __ext4_journal_start fs/ext4/ext4_jbd2.h:320 [inline]
> >  ext4_da_write_begin+0x460/0xaf0 fs/ext4/inode.c:2998
> >  generic_perform_write+0x196/0x3a0 mm/filemap.c:3575
> >  ext4_buffered_write_iter+0x2e5/0x3e0 fs/ext4/file.c:269
> >  ext4_file_write_iter+0x48a/0x10b0 fs/ext4/file.c:502
> >  call_write_iter include/linux/fs.h:1977 [inline]
> >  do_iter_readv_writev+0x2cb/0x360 fs/read_write.c:740
> >  do_iter_write+0x112/0x4c0 fs/read_write.c:866
> >  vfs_iter_write+0x4c/0x70 fs/read_write.c:907
> >  iter_file_splice_write+0x40a/0x750 fs/splice.c:689
> >  do_splice_from fs/splice.c:767 [inline]
> >  direct_splice_actor+0x80/0xa0 fs/splice.c:936
> >  splice_direct_to_actor+0x345/0x650 fs/splice.c:891
> >  do_splice_direct+0xf5/0x170 fs/splice.c:979
> >  do_sendfile+0x7a6/0xe20 fs/read_write.c:1260
> >  __do_sys_sendfile64 fs/read_write.c:1319 [inline]
> >  __se_sys_sendfile64 fs/read_write.c:1311 [inline]
> >  __x64_sys_sendfile64+0xa9/0x130 fs/read_write.c:1311
> >  do_syscall_64+0x39/0x80 arch/x86/entry/common.c:46
> >  entry_SYSCALL_64_after_hwframe+0x44/0xae
> >
> > Reported by Kernel Concurrency Sanitizer on:
> > CPU: 0 PID: 29936 Comm: syz-executor.5 Not tainted 5.12.0-rc2-syzkaller #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> > Google 01/01/2011
> > ==
>
> So this case is harmless. start_this_handle() does indeed check
> journal->j_running_transaction without any protection and this is only a
> racy check to opportunistically preallocate a transaction if we are likely
> to need it. There was some macro to instruct KCSAN that the read is
> actually fine, wasn't there?

+Marco
+Tetsuo, did you want to fix it?


Re: [syzbot] KCSAN: data-race in start_this_handle / start_this_handle

2021-03-11 Thread Jan Kara
On Thu 11-03-21 02:59:14, syzbot wrote:
> HEAD commit:a74e6a01 Merge tag 's390-5.12-3' of git://git.kernel.org/p..
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=159f69ecd0
> kernel config:  https://syzkaller.appspot.com/x/.config?x=de394bbaade74fb7
> dashboard link: https://syzkaller.appspot.com/bug?extid=30774a6acf6a2cf6d535
> compiler:   Debian clang version 11.0.1-2
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+30774a6acf6a2cf6d...@syzkaller.appspotmail.com
> 
> ==
> BUG: KCSAN: data-race in start_this_handle / start_this_handle
> 
> write to 0x888103880870 of 8 bytes by task 29956 on cpu 1:
>  jbd2_get_transaction fs/jbd2/transaction.c:125 [inline]
>  start_this_handle+0xceb/0x1010 fs/jbd2/transaction.c:400
>  jbd2__journal_start+0x1fc/0x3f0 fs/jbd2/transaction.c:503
>  __ext4_journal_start_sb+0x159/0x310 fs/ext4/ext4_jbd2.c:105
>  __ext4_journal_start fs/ext4/ext4_jbd2.h:320 [inline]
>  ext4_da_write_begin+0x460/0xaf0 fs/ext4/inode.c:2998
>  generic_perform_write+0x196/0x3a0 mm/filemap.c:3575
>  ext4_buffered_write_iter+0x2e5/0x3e0 fs/ext4/file.c:269
>  ext4_file_write_iter+0x48a/0x10b0 fs/ext4/file.c:502
>  call_write_iter include/linux/fs.h:1977 [inline]
>  do_iter_readv_writev+0x2cb/0x360 fs/read_write.c:740
>  do_iter_write+0x112/0x4c0 fs/read_write.c:866
>  vfs_iter_write+0x4c/0x70 fs/read_write.c:907
>  iter_file_splice_write+0x40a/0x750 fs/splice.c:689
>  do_splice_from fs/splice.c:767 [inline]
>  direct_splice_actor+0x80/0xa0 fs/splice.c:936
>  splice_direct_to_actor+0x345/0x650 fs/splice.c:891
>  do_splice_direct+0xf5/0x170 fs/splice.c:979
>  do_sendfile+0x7a6/0xe20 fs/read_write.c:1260
>  __do_sys_sendfile64 fs/read_write.c:1319 [inline]
>  __se_sys_sendfile64 fs/read_write.c:1311 [inline]
>  __x64_sys_sendfile64+0xa9/0x130 fs/read_write.c:1311
>  do_syscall_64+0x39/0x80 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> read to 0x888103880870 of 8 bytes by task 29936 on cpu 0:
>  start_this_handle+0x1c1/0x1010 fs/jbd2/transaction.c:352
>  jbd2__journal_start+0x1fc/0x3f0 fs/jbd2/transaction.c:503
>  __ext4_journal_start_sb+0x159/0x310 fs/ext4/ext4_jbd2.c:105
>  __ext4_journal_start fs/ext4/ext4_jbd2.h:320 [inline]
>  ext4_da_write_begin+0x460/0xaf0 fs/ext4/inode.c:2998
>  generic_perform_write+0x196/0x3a0 mm/filemap.c:3575
>  ext4_buffered_write_iter+0x2e5/0x3e0 fs/ext4/file.c:269
>  ext4_file_write_iter+0x48a/0x10b0 fs/ext4/file.c:502
>  call_write_iter include/linux/fs.h:1977 [inline]
>  do_iter_readv_writev+0x2cb/0x360 fs/read_write.c:740
>  do_iter_write+0x112/0x4c0 fs/read_write.c:866
>  vfs_iter_write+0x4c/0x70 fs/read_write.c:907
>  iter_file_splice_write+0x40a/0x750 fs/splice.c:689
>  do_splice_from fs/splice.c:767 [inline]
>  direct_splice_actor+0x80/0xa0 fs/splice.c:936
>  splice_direct_to_actor+0x345/0x650 fs/splice.c:891
>  do_splice_direct+0xf5/0x170 fs/splice.c:979
>  do_sendfile+0x7a6/0xe20 fs/read_write.c:1260
>  __do_sys_sendfile64 fs/read_write.c:1319 [inline]
>  __se_sys_sendfile64 fs/read_write.c:1311 [inline]
>  __x64_sys_sendfile64+0xa9/0x130 fs/read_write.c:1311
>  do_syscall_64+0x39/0x80 arch/x86/entry/common.c:46
>  entry_SYSCALL_64_after_hwframe+0x44/0xae
> 
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 0 PID: 29936 Comm: syz-executor.5 Not tainted 5.12.0-rc2-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS 
> Google 01/01/2011
> ==

So this case is harmless. start_this_handle() does indeed check
journal->j_running_transaction without any protection and this is only a
racy check to opportunistically preallocate a transaction if we are likely
to need it. There was some macro to instruct KCSAN that the read is
actually fine, wasn't there?

Honza
-- 
Jan Kara 
SUSE Labs, CR


[syzbot] KCSAN: data-race in start_this_handle / start_this_handle

2021-03-11 Thread syzbot
Hello,

syzbot found the following issue on:

HEAD commit:a74e6a01 Merge tag 's390-5.12-3' of git://git.kernel.org/p..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=159f69ecd0
kernel config:  https://syzkaller.appspot.com/x/.config?x=de394bbaade74fb7
dashboard link: https://syzkaller.appspot.com/bug?extid=30774a6acf6a2cf6d535
compiler:   Debian clang version 11.0.1-2

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+30774a6acf6a2cf6d...@syzkaller.appspotmail.com

==
BUG: KCSAN: data-race in start_this_handle / start_this_handle

write to 0x888103880870 of 8 bytes by task 29956 on cpu 1:
 jbd2_get_transaction fs/jbd2/transaction.c:125 [inline]
 start_this_handle+0xceb/0x1010 fs/jbd2/transaction.c:400
 jbd2__journal_start+0x1fc/0x3f0 fs/jbd2/transaction.c:503
 __ext4_journal_start_sb+0x159/0x310 fs/ext4/ext4_jbd2.c:105
 __ext4_journal_start fs/ext4/ext4_jbd2.h:320 [inline]
 ext4_da_write_begin+0x460/0xaf0 fs/ext4/inode.c:2998
 generic_perform_write+0x196/0x3a0 mm/filemap.c:3575
 ext4_buffered_write_iter+0x2e5/0x3e0 fs/ext4/file.c:269
 ext4_file_write_iter+0x48a/0x10b0 fs/ext4/file.c:502
 call_write_iter include/linux/fs.h:1977 [inline]
 do_iter_readv_writev+0x2cb/0x360 fs/read_write.c:740
 do_iter_write+0x112/0x4c0 fs/read_write.c:866
 vfs_iter_write+0x4c/0x70 fs/read_write.c:907
 iter_file_splice_write+0x40a/0x750 fs/splice.c:689
 do_splice_from fs/splice.c:767 [inline]
 direct_splice_actor+0x80/0xa0 fs/splice.c:936
 splice_direct_to_actor+0x345/0x650 fs/splice.c:891
 do_splice_direct+0xf5/0x170 fs/splice.c:979
 do_sendfile+0x7a6/0xe20 fs/read_write.c:1260
 __do_sys_sendfile64 fs/read_write.c:1319 [inline]
 __se_sys_sendfile64 fs/read_write.c:1311 [inline]
 __x64_sys_sendfile64+0xa9/0x130 fs/read_write.c:1311
 do_syscall_64+0x39/0x80 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae

read to 0x888103880870 of 8 bytes by task 29936 on cpu 0:
 start_this_handle+0x1c1/0x1010 fs/jbd2/transaction.c:352
 jbd2__journal_start+0x1fc/0x3f0 fs/jbd2/transaction.c:503
 __ext4_journal_start_sb+0x159/0x310 fs/ext4/ext4_jbd2.c:105
 __ext4_journal_start fs/ext4/ext4_jbd2.h:320 [inline]
 ext4_da_write_begin+0x460/0xaf0 fs/ext4/inode.c:2998
 generic_perform_write+0x196/0x3a0 mm/filemap.c:3575
 ext4_buffered_write_iter+0x2e5/0x3e0 fs/ext4/file.c:269
 ext4_file_write_iter+0x48a/0x10b0 fs/ext4/file.c:502
 call_write_iter include/linux/fs.h:1977 [inline]
 do_iter_readv_writev+0x2cb/0x360 fs/read_write.c:740
 do_iter_write+0x112/0x4c0 fs/read_write.c:866
 vfs_iter_write+0x4c/0x70 fs/read_write.c:907
 iter_file_splice_write+0x40a/0x750 fs/splice.c:689
 do_splice_from fs/splice.c:767 [inline]
 direct_splice_actor+0x80/0xa0 fs/splice.c:936
 splice_direct_to_actor+0x345/0x650 fs/splice.c:891
 do_splice_direct+0xf5/0x170 fs/splice.c:979
 do_sendfile+0x7a6/0xe20 fs/read_write.c:1260
 __do_sys_sendfile64 fs/read_write.c:1319 [inline]
 __se_sys_sendfile64 fs/read_write.c:1311 [inline]
 __x64_sys_sendfile64+0xa9/0x130 fs/read_write.c:1311
 do_syscall_64+0x39/0x80 arch/x86/entry/common.c:46
 entry_SYSCALL_64_after_hwframe+0x44/0xae

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 29936 Comm: syz-executor.5 Not tainted 5.12.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
==


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkal...@googlegroups.com.

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.