Re: [Cluster-devel] [syzbot] WARNING in __set_page_dirty

2021-08-18 Thread syzbot
syzbot has found a reproducer for the following issue on:

HEAD commit:f8fbb47c6e86 Merge branch 'for-v5.14' of git://git.kernel...
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=125aadf630
kernel config:  https://syzkaller.appspot.com/x/.config?x=e3a20bae04b96ccd
dashboard link: https://syzkaller.appspot.com/bug?extid=0d5b462a6f07447991b3
compiler:   gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for 
Debian) 2.35.1
syz repro:  https://syzkaller.appspot.com/x/repro.syz?x=122742ee30
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=1792538130

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+0d5b462a6f0744799...@syzkaller.appspotmail.com

NILFS (loop0): segctord starting. Construction interval = 5 seconds, CP 
frequency < 30 seconds
[ cut here ]
WARNING: CPU: 0 PID: 8496 at include/linux/backing-dev.h:283 inode_to_wb 
include/linux/backing-dev.h:283 [inline]
WARNING: CPU: 0 PID: 8496 at include/linux/backing-dev.h:283 
account_page_dirtied mm/page-writeback.c:2435 [inline]
WARNING: CPU: 0 PID: 8496 at include/linux/backing-dev.h:283 
__set_page_dirty+0xace/0x1070 mm/page-writeback.c:2483
Modules linked in:
CPU: 0 PID: 8496 Comm: segctord Not tainted 5.14.0-rc5-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 
01/01/2011
RIP: 0010:inode_to_wb include/linux/backing-dev.h:283 [inline]
RIP: 0010:account_page_dirtied mm/page-writeback.c:2435 [inline]
RIP: 0010:__set_page_dirty+0xace/0x1070 mm/page-writeback.c:2483
Code: a8 01 00 00 be ff ff ff ff 48 8d 78 70 e8 ea 60 8d 07 31 ff 89 c3 89 c6 
e8 cf a6 d8 ff 85 db 0f 85 ac f7 ff ff e8 82 9f d8 ff <0f> 0b e9 a0 f7 ff ff e8 
76 9f d8 ff 4c 8d 75 08 48 b8 00 00 00 00
RSP: 0018:c9000175f8c8 EFLAGS: 00010093
RAX:  RBX:  RCX: 
RDX: 8880263b9c40 RSI: 819d083e RDI: 0003
RBP: ea82dac0 R08:  R09: 0001
R10: 819d0831 R11:  R12: 0293
R13: 888037e60138 R14: 888037e60488 R15: 888037e602e0
FS:  () GS:8880b9c0() knlGS:
CS:  0010 DS:  ES:  CR0: 80050033
CR2: 5593610abbe0 CR3: 16882000 CR4: 00350ef0
Call Trace:
 mark_buffer_dirty+0x49a/0x5e0 fs/buffer.c:1108
 nilfs_btree_propagate_p fs/nilfs2/btree.c:1889 [inline]
 nilfs_btree_propagate+0x4ae/0xea0 fs/nilfs2/btree.c:2085
 nilfs_bmap_propagate+0x73/0x170 fs/nilfs2/bmap.c:337
 nilfs_collect_dat_data+0x45/0xd0 fs/nilfs2/segment.c:625
 nilfs_segctor_apply_buffers+0x14a/0x470 fs/nilfs2/segment.c:1009
 nilfs_segctor_scan_file+0x3e4/0x700 fs/nilfs2/segment.c:1058
 nilfs_segctor_collect_blocks fs/nilfs2/segment.c:1224 [inline]
 nilfs_segctor_collect fs/nilfs2/segment.c:1494 [inline]
 nilfs_segctor_do_construct+0x16ee/0x6b20 fs/nilfs2/segment.c:2036
 nilfs_segctor_construct+0x7a7/0xb30 fs/nilfs2/segment.c:2372
 nilfs_segctor_thread_construct fs/nilfs2/segment.c:2480 [inline]
 nilfs_segctor_thread+0x3c3/0xf90 fs/nilfs2/segment.c:2563
 kthread+0x3e5/0x4d0 kernel/kthread.c:319
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295

Code disassembly (best guess):
   0:   a8 01                   test   $0x1,%al
   2:   00 00                   add    %al,(%rax)
   4:   be ff ff ff ff          mov    $0x,%esi
   9:   48 8d 78 70             lea    0x70(%rax),%rdi
   d:   e8 ea 60 8d 07          callq  0x78d60fc
  12:   31 ff                   xor    %edi,%edi
  14:   89 c3                   mov    %eax,%ebx
  16:   89 c6                   mov    %eax,%esi
  18:   e8 cf a6 d8 ff          callq  0xffd8a6ec
  1d:   85 db                   test   %ebx,%ebx
  1f:   0f 85 ac f7 ff ff       jne    0xf7d1
  25:   e8 82 9f d8 ff          callq  0xffd89fac
  2a:   0f 0b                   ud2    <-- trapping instruction
  2c:   e9 a0 f7 ff ff          jmpq   0xf7d1
  31:   e8 76 9f d8 ff          callq  0xffd89fac
  36:   4c 8d 75 08             lea    0x8(%rbp),%r14
  3a:   48                      rex.W
  3b:   b8 00 00 00 00          mov    $0x0,%eax



Re: [Cluster-devel] [syzbot] WARNING in __set_page_dirty

2021-07-22 Thread Steven Whitehouse
Hi,

On Thu, 2021-07-22 at 08:16 -0500, Bob Peterson wrote:
> On 7/21/21 4:58 PM, Andrew Morton wrote:
> > (cc gfs2 maintainers)
> > 
> > On Tue, 20 Jul 2021 19:07:25 -0700 syzbot <
> > syzbot+0d5b462a6f0744799...@syzkaller.appspotmail.com> wrote:
> > 
> > > Hello,
> > > 
> > > syzbot found the following issue on:
> > > 
> > > HEAD commit:d936eb238744 Revert "Makefile: Enable -Wimplicit-
> > > fallthrou..
> > > git tree:   upstream
> > > console output: 
> > > https://syzkaller.appspot.com/x/log.txt?x=1512834a30
> > > kernel config:  
> > > https://syzkaller.appspot.com/x/.config?x=f1b998c1afc13578
> > > dashboard link: 
> > > https://syzkaller.appspot.com/bug?extid=0d5b462a6f07447991b3
> > > userspace arch: i386
> > > 
> > > Unfortunately, I don't have any reproducer for this issue yet.
> > > 
> > > IMPORTANT: if you fix the issue, please add the following tag to
> > > the commit:
> > > Reported-by: 
> > > syzbot+0d5b462a6f0744799...@syzkaller.appspotmail.com
> > > 
> > > [ cut here ]
> > > WARNING: CPU: 0 PID: 8696 at include/linux/backing-dev.h:283
> > > inode_to_wb include/linux/backing-dev.h:283 [inline]
> > > WARNING: CPU: 0 PID: 8696 at include/linux/backing-dev.h:283
> > > account_page_dirtied mm/page-writeback.c:2435 [inline]
> > > WARNING: CPU: 0 PID: 8696 at include/linux/backing-dev.h:283
> > > __set_page_dirty+0xace/0x1070 mm/page-writeback.c:2483
> >  
> 
> Okay, sorry for the brain fart earlier. After taking a better look, I
> know exactly what this is.
> This goes back to this discussion from April 2018:
> 
> https://listman.redhat.com/archives/cluster-devel/2018-April/msg00017.html
> 
> in which Jan Kara pointed out that:
> 
> "The problem is we really do expect mapping->host->i_mapping ==
> mapping as
> we pass mapping and inode interchangeably in the mm code. The
> address_space
> and inodes are separate structures because you can have many inodes
> pointing to one address space (block devices). However it is not
> allowed
> for several address_spaces to point to one inode!"
> The problem is that GFS2 keeps separate address spaces for its
> glocks, and they
> don't correspond 1:1 to any inode. So mapping->host is not really an
> inode for these,
> and there's really almost no relation between the glock->mapping and
> the inode it
> points to.
> 
> Even in the recent past, GFS2 did this for all metadata for both its
> media-backed glocks:
> resource groups and inodes.
> 
> I recently posted a patch set to cluster-devel ("gfs2: replace
> sd_aspace with sd_inode" -
> https://listman.redhat.com/archives/cluster-devel/2021-July/msg00066.html) in
> which
> I fixed half the problem, which is the resource group case.
> 
> Unfortunately, for inode glocks it gets a lot trickier and I haven't
> found a proper solution.
> But as I said, it's been a known issue for several years now. The
> errors only appear
> if LOCKDEP is turned on. It would be ideal if address spaces were
> treated as fully
> independent from their inodes, but no one seemed to jump on that
> idea, nor even try to
> explain why we make the assumptions Jan Kara pointed out.
> 
> In the meantime, I'll keep looking for a more proper solution. This
> won't be an easy
> thing to fix or I would have already fixed it.
> 
> Regards,
> 
> Bob Peterson
> 
> 

The reason for having address_spaces pointed to by many inodes is to
allow for stackable filesystems so that you can make the file content
available on the upper layer by just pointing the upper layer inode at
the lower layer address_space. That is presumably what Jan is thinking
of.

This, however, seems to be an issue with a page flag, so it isn't clear
why it would relate to the address_space. If the page is metadata, which
would be the most usual case for something being unpinned, then that
page should definitely be up to date.
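
For reference, this is the function named in the report, abbreviated
from the 5.14-era mm/page-writeback.c (comments added here; treat it as
a sketch rather than an exact quote). The PageUptodate() test is the
page-flag check, while backing-dev.h:283 from the report is a lockdep
assertion reached via account_page_dirtied():

void __set_page_dirty(struct page *page, struct address_space *mapping,
		      int warn)
{
	unsigned long flags;

	xa_lock_irqsave(&mapping->i_pages, flags);
	if (page->mapping) {	/* Race with truncate? */
		/* The page-flag warning; only armed when warn is set. */
		WARN_ON_ONCE(warn && !PageUptodate(page));
		/* Derives the inode from mapping->host and ends up in
		 * inode_to_wb(), i.e. backing-dev.h:283 in the report. */
		account_page_dirtied(page, mapping);
		__xa_set_mark(&mapping->i_pages, page_index(page),
			      PAGECACHE_TAG_DIRTY);
	}
	xa_unlock_irqrestore(&mapping->i_pages, flags);
}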

Looking back at the earlier rgrp fix mentioned above, that fix is not
unreasonable, since only a single inode is needed to contain all the
rgrps. For the inode metadata that is not the case: there is a
one-to-one mapping between inodes and metadata address_spaces. If the
working assumption is that multiple address_spaces per inode are not
allowed, then I think that has changed over time. I'm pretty sure that
I checked the expectations way back when we adopted this solution, and
that there were no issues with the multiple-address_spaces-per-inode
case. We definitely don't want to go back to adding an additional
struct inode (which does nothing except take up space!) to each "real"
inode in cache, because that is a big overhead for a filesystem with
many small files.

Still, if this is only a lockdep issue, then we likely have some time
to figure out a good long-term solution,

Steve.





Re: [Cluster-devel] [syzbot] WARNING in __set_page_dirty

2021-07-22 Thread Bob Peterson

On 7/21/21 4:58 PM, Andrew Morton wrote:

(cc gfs2 maintainers)

On Tue, 20 Jul 2021 19:07:25 -0700 syzbot 
 wrote:


Hello,

syzbot found the following issue on:

HEAD commit:d936eb238744 Revert "Makefile: Enable -Wimplicit-fallthrou..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1512834a30
kernel config:  https://syzkaller.appspot.com/x/.config?x=f1b998c1afc13578
dashboard link: https://syzkaller.appspot.com/bug?extid=0d5b462a6f07447991b3
userspace arch: i386

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+0d5b462a6f0744799...@syzkaller.appspotmail.com

[ cut here ]
WARNING: CPU: 0 PID: 8696 at include/linux/backing-dev.h:283 inode_to_wb 
include/linux/backing-dev.h:283 [inline]
WARNING: CPU: 0 PID: 8696 at include/linux/backing-dev.h:283 
account_page_dirtied mm/page-writeback.c:2435 [inline]
WARNING: CPU: 0 PID: 8696 at include/linux/backing-dev.h:283 
__set_page_dirty+0xace/0x1070 mm/page-writeback.c:2483


Okay, sorry for the brain fart earlier. After taking a better look, I 
know exactly what this is.

This goes back to this discussion from April 2018:

https://listman.redhat.com/archives/cluster-devel/2018-April/msg00017.html

in which Jan Kara pointed out that:

"The problem is we really do expect mapping->host->i_mapping == mapping as
we pass mapping and inode interchangeably in the mm code. The address_space
and inodes are separate structures because you can have many inodes
pointing to one address space (block devices). However it is not allowed
for several address_spaces to point to one inode!"
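
Expressed as code, that expectation is roughly the following. This is
an illustrative helper only; the name is made up and it is not in the
kernel tree:

#include <linux/fs.h>	/* struct inode, struct address_space */

/* Hypothetical helper: the invariant the mm code relies on.  The
 * address_space handed to the writeback code is expected to be the one
 * and only i_mapping of its host inode. */
static inline bool mapping_matches_host(struct address_space *mapping)
{
	return mapping->host && mapping->host->i_mapping == mapping;
}

Presumably this does not hold for the separate per-glock metadata
address spaces described below, which is what trips the check.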

The problem is that GFS2 keeps separate address spaces for its glocks,
and they don't correspond 1:1 to any inode. So mapping->host is not
really an inode for these, and there's really almost no relation between
the glock->mapping and the inode it points to.

Even in the recent past, GFS2 did this for all metadata for both its
media-backed glocks: resource groups and inodes.

I recently posted a patch set to cluster-devel ("gfs2: replace sd_aspace
with sd_inode" -
https://listman.redhat.com/archives/cluster-devel/2021-July/msg00066.html)
in which I fixed half the problem, which is the resource group case.

Unfortunately, for inode glocks it gets a lot trickier and I haven't
found a proper solution. But as I said, it's been a known issue for
several years now. The errors only appear if LOCKDEP is turned on. It
would be ideal if address spaces were treated as fully independent from
their inodes, but no one seemed to jump on that idea, nor even try to
explain why we make the assumptions Jan Kara pointed out.
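
For reference, the lockdep-only check that produces the warning is the
assertion in inode_to_wb(), shown here approximately as it appears in
the 5.14-era include/linux/backing-dev.h (the line number matches the
report above):

static inline struct bdi_writeback *inode_to_wb(const struct inode *inode)
{
#ifdef CONFIG_LOCKDEP
	/* The caller must hold one of these three locks.  Note the check
	 * looks at inode->i_mapping, not at whichever address_space the
	 * caller actually locked. */
	WARN_ON_ONCE(debug_locks &&
		     (!lockdep_is_held(&inode->i_lock) &&
		      !lockdep_is_held(&inode->i_mapping->i_pages.xa_lock) &&
		      !lockdep_is_held(&inode->i_wb->list_lock)));
#endif
	return inode->i_wb;
}

Since __set_page_dirty() locks the i_pages of the address_space it was
handed, and that address_space is not inode->i_mapping in the glock
case, none of the three conditions appears to be held, so the warning
fires, but only when lockdep is compiled in.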

In the meantime, I'll keep looking for a more proper solution. This
won't be an easy thing to fix or I would have already fixed it.

Regards,

Bob Peterson




Re: [Cluster-devel] [syzbot] WARNING in __set_page_dirty

2021-07-22 Thread Bob Peterson

On 7/21/21 4:58 PM, Andrew Morton wrote:

(cc gfs2 maintainers)

On Tue, 20 Jul 2021 19:07:25 -0700 syzbot 
 wrote:


Hello,

syzbot found the following issue on:

HEAD commit:d936eb238744 Revert "Makefile: Enable -Wimplicit-fallthrou..
git tree:   upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1512834a30
kernel config:  https://syzkaller.appspot.com/x/.config?x=f1b998c1afc13578
dashboard link: https://syzkaller.appspot.com/bug?extid=0d5b462a6f07447991b3
userspace arch: i386

Unfortunately, I don't have any reproducer for this issue yet.

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+0d5b462a6f0744799...@syzkaller.appspotmail.com

[ cut here ]
WARNING: CPU: 0 PID: 8696 at include/linux/backing-dev.h:283 inode_to_wb 
include/linux/backing-dev.h:283 [inline]
WARNING: CPU: 0 PID: 8696 at include/linux/backing-dev.h:283 
account_page_dirtied mm/page-writeback.c:2435 [inline]
WARNING: CPU: 0 PID: 8696 at include/linux/backing-dev.h:283 
__set_page_dirty+0xace/0x1070 mm/page-writeback.c:2483
Modules linked in:
CPU: 0 PID: 8696 Comm: syz-executor.0 Not tainted 5.14.0-rc1-syzkaller #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
RIP: 0010:inode_to_wb include/linux/backing-dev.h:283 [inline]
RIP: 0010:account_page_dirtied mm/page-writeback.c:2435 [inline]
RIP: 0010:__set_page_dirty+0xace/0x1070 mm/page-writeback.c:2483
Code: a8 01 00 00 be ff ff ff ff 48 8d 78 70 e8 0a bf 8c 07 31 ff 89 c3 89 c6 e8 3f 
af d8 ff 85 db 0f 85 ac f7 ff ff e8 f2 a7 d8 ff <0f> 0b e9 a0 f7 ff ff e8 e6 a7 
d8 ff 4c 8d 75 08 48 b8 00 00 00 00
RSP: :c9e578a0 EFLAGS: 00010093
RAX:  RBX:  RCX: 
RDX: 888013d71c40 RSI: 819cdfce RDI: 0003
RBP: ea0001de0240 R08:  R09: 888019819e07
R10: 819cdfc1 R11:  R12: 0293
R13: 888078a38c90 R14: 888019819e00 R15: 888019819c58
FS:  () GS:88802ca0(0063) knlGS:09b20380
CS:  0010 DS: 002b ES: 002b CR0: 80050033
CR2: 7fd805161390 CR3: 4c16a000 CR4: 00150ef0
DR0:  DR1:  DR2: 
DR3:  DR6: fffe0ff0 DR7: 0400
Call Trace:
  mark_buffer_dirty+0x49a/0x5e0 fs/buffer.c:1108
  gfs2_unpin+0x123/0xd10 fs/gfs2/lops.c:111
  buf_lo_after_commit+0x140/0x210 fs/gfs2/lops.c:750
  lops_after_commit fs/gfs2/lops.h:49 [inline]
  gfs2_log_flush+0x162b/0x2940 fs/gfs2/log.c:1108
  do_sync+0x5ab/0xcd0 fs/gfs2/quota.c:967
  gfs2_quota_sync+0x2e2/0x660 fs/gfs2/quota.c:1310
  gfs2_sync_fs+0x40/0xb0 fs/gfs2/super.c:711
  __sync_filesystem fs/sync.c:39 [inline]

Seems that gfs2_unpin() is running mark_buffer_dirty() against a bh
which is attached to a non-upto-date page.

Hmm. That mark_buffer_dirty has been there since 2007, so this will 
require some analysis.

A reproducer would be helpful, since we've never seen this before.

Bob Peterson




Re: [Cluster-devel] [syzbot] WARNING in __set_page_dirty

2021-07-21 Thread Andrew Morton
(cc gfs2 maintainers)

On Tue, 20 Jul 2021 19:07:25 -0700 syzbot 
 wrote:

> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:d936eb238744 Revert "Makefile: Enable -Wimplicit-fallthrou..
> git tree:   upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1512834a30
> kernel config:  https://syzkaller.appspot.com/x/.config?x=f1b998c1afc13578
> dashboard link: https://syzkaller.appspot.com/bug?extid=0d5b462a6f07447991b3
> userspace arch: i386
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+0d5b462a6f0744799...@syzkaller.appspotmail.com
> 
> [ cut here ]
> WARNING: CPU: 0 PID: 8696 at include/linux/backing-dev.h:283 inode_to_wb 
> include/linux/backing-dev.h:283 [inline]
> WARNING: CPU: 0 PID: 8696 at include/linux/backing-dev.h:283 
> account_page_dirtied mm/page-writeback.c:2435 [inline]
> WARNING: CPU: 0 PID: 8696 at include/linux/backing-dev.h:283 
> __set_page_dirty+0xace/0x1070 mm/page-writeback.c:2483
> Modules linked in:
> CPU: 0 PID: 8696 Comm: syz-executor.0 Not tainted 5.14.0-rc1-syzkaller #0
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
> RIP: 0010:inode_to_wb include/linux/backing-dev.h:283 [inline]
> RIP: 0010:account_page_dirtied mm/page-writeback.c:2435 [inline]
> RIP: 0010:__set_page_dirty+0xace/0x1070 mm/page-writeback.c:2483
> Code: a8 01 00 00 be ff ff ff ff 48 8d 78 70 e8 0a bf 8c 07 31 ff 89 c3 89 c6 
> e8 3f af d8 ff 85 db 0f 85 ac f7 ff ff e8 f2 a7 d8 ff <0f> 0b e9 a0 f7 ff ff 
> e8 e6 a7 d8 ff 4c 8d 75 08 48 b8 00 00 00 00
> RSP: :c9e578a0 EFLAGS: 00010093
> RAX:  RBX:  RCX: 
> RDX: 888013d71c40 RSI: 819cdfce RDI: 0003
> RBP: ea0001de0240 R08:  R09: 888019819e07
> R10: 819cdfc1 R11:  R12: 0293
> R13: 888078a38c90 R14: 888019819e00 R15: 888019819c58
> FS:  () GS:88802ca0(0063) knlGS:09b20380
> CS:  0010 DS: 002b ES: 002b CR0: 80050033
> CR2: 7fd805161390 CR3: 4c16a000 CR4: 00150ef0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> Call Trace:
>  mark_buffer_dirty+0x49a/0x5e0 fs/buffer.c:1108
>  gfs2_unpin+0x123/0xd10 fs/gfs2/lops.c:111
>  buf_lo_after_commit+0x140/0x210 fs/gfs2/lops.c:750
>  lops_after_commit fs/gfs2/lops.h:49 [inline]
>  gfs2_log_flush+0x162b/0x2940 fs/gfs2/log.c:1108
>  do_sync+0x5ab/0xcd0 fs/gfs2/quota.c:967
>  gfs2_quota_sync+0x2e2/0x660 fs/gfs2/quota.c:1310
>  gfs2_sync_fs+0x40/0xb0 fs/gfs2/super.c:711
>  __sync_filesystem fs/sync.c:39 [inline]

Seems that gfs2_unpin() is running mark_buffer_dirty() against a bh
which is attached to a non-upto-date page.

>  sync_filesystem fs/sync.c:64 [inline]
>  sync_filesystem+0x105/0x260 fs/sync.c:48
>  generic_shutdown_super+0x70/0x370 fs/super.c:448
>  kill_block_super+0x97/0xf0 fs/super.c:1395
>  gfs2_kill_sb+0x104/0x160 fs/gfs2/ops_fstype.c:1682
>  deactivate_locked_super+0x94/0x160 fs/super.c:335
>  deactivate_super+0xad/0xd0 fs/super.c:366
>  cleanup_mnt+0x3a2/0x540 fs/namespace.c:1136
>  task_work_run+0xdd/0x1a0 kernel/task_work.c:164
>  tracehook_notify_resume include/linux/tracehook.h:189 [inline]
>  exit_to_user_mode_loop kernel/entry/common.c:175 [inline]
>  exit_to_user_mode_prepare+0x27e/0x290 kernel/entry/common.c:209
>  __syscall_exit_to_user_mode_work kernel/entry/common.c:291 [inline]
>  syscall_exit_to_user_mode+0x19/0x60 kernel/entry/common.c:302
>  __do_fast_syscall_32+0x72/0xf0 arch/x86/entry/common.c:181
>  do_fast_syscall_32+0x2f/0x70 arch/x86/entry/common.c:203
>  entry_SYSENTER_compat_after_hwframe+0x4d/0x5c
> RIP: 0023:0xf7f86549
> Code: 03 74 c0 01 10 05 03 74 b8 01 10 06 03 74 b4 01 10 07 03 74 b0 01 10 08 
> 03 74 d8 01 00 00 00 00 00 51 52 55 89 e5 0f 34 cd 80 <5d> 5a 59 c3 90 90 90 
> 90 8d b4 26 00 00 00 00 8d b4 26 00 00 00 00
> RSP: 002b:ffeb89bc EFLAGS: 0296 ORIG_RAX: 0034
> RAX:  RBX: ffeb8a60 RCX: 0002
> RDX: 0816c000 RSI:  RDI: 080ea118
> RBP: ffeb8a60 R08:  R09: 
> R10:  R11:  R12: 
> R13:  R14:  R15: 
>