Re: [Cluster-devel] WARNING in account_page_dirtied

2018-04-04 Thread Jan Kara
Hi,

On Wed 04-04-18 10:24:48, Steven Whitehouse wrote:
> On 03/04/18 13:05, Jan Kara wrote:
> > Hello,
> > 
> > On Sun 01-04-18 10:01:02, syzbot wrote:
> > > syzbot hit the following crash on upstream commit
> > > 10b84daddbec72c6b440216a69de9a9605127f7a (Sat Mar 31 17:59:00 2018 +)
> > > Merge branch 'perf-urgent-for-linus' of
> > > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
> > > syzbot dashboard link:
> > > https://syzkaller.appspot.com/bug?extid=b7772c65a1d88bfd8fca
> > > 
> > > C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5705587757154304
> > > syzkaller reproducer:
> > > https://syzkaller.appspot.com/x/repro.syz?id=5644332530925568
> > > Raw console output:
> > > https://syzkaller.appspot.com/x/log.txt?id=5472755969425408
> > > Kernel config:
> > > https://syzkaller.appspot.com/x/.config?id=-2760467897697295172
> > > compiler: gcc (GCC) 7.1.1 20170620
> > > 
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: syzbot+b7772c65a1d88bfd8...@syzkaller.appspotmail.com
> > > It will help syzbot understand when the bug is fixed. See footer for
> > > details.
> > > If you forward the report, please keep this part and the footer.
> > > 
> > > gfs2: fsid=loop0.0: jid=0, already locked for use
> > > gfs2: fsid=loop0.0: jid=0: Looking at journal...
> > > gfs2: fsid=loop0.0: jid=0: Done
> > > gfs2: fsid=loop0.0: first mount done, others may mount
> > > gfs2: fsid=loop0.0: found 1 quota changes
> > > WARNING: CPU: 0 PID: 4469 at ./include/linux/backing-dev.h:341 inode_to_wb
> > > include/linux/backing-dev.h:338 [inline]
> > > WARNING: CPU: 0 PID: 4469 at ./include/linux/backing-dev.h:341
> > > account_page_dirtied+0x8f9/0xcb0 mm/page-writeback.c:2416
> > > Kernel panic - not syncing: panic_on_warn set ...
> > > 
> > > CPU: 0 PID: 4469 Comm: syzkaller368843 Not tainted 4.16.0-rc7+ #9
> > > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > > Google 01/01/2011
> > > Call Trace:
> > >   __dump_stack lib/dump_stack.c:17 [inline]
> > >   dump_stack+0x194/0x24d lib/dump_stack.c:53
> > >   panic+0x1e4/0x41c kernel/panic.c:183
> > >   __warn+0x1dc/0x200 kernel/panic.c:547
> > >   report_bug+0x1f4/0x2b0 lib/bug.c:186
> > >   fixup_bug.part.10+0x37/0x80 arch/x86/kernel/traps.c:178
> > >   fixup_bug arch/x86/kernel/traps.c:247 [inline]
> > >   do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
> > >   do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
> > >   invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986
> > > RIP: 0010:inode_to_wb include/linux/backing-dev.h:338 [inline]
> > > RIP: 0010:account_page_dirtied+0x8f9/0xcb0 mm/page-writeback.c:2416
> > > RSP: 0018:8801d966e5c0 EFLAGS: 00010093
> > > RAX: 8801acb7e600 RBX: 11003b2cdcba RCX: 818f47a9
> > > RDX:  RSI: 8801d3338148 RDI: 0082
> > > RBP: 8801d966e698 R08: 11003b2cdc13 R09: 000c
> > > R10: 8801d966e558 R11: 0002 R12: 8801c96f0368
> > > R13: ea0006b12780 R14: 8801c96f01d8 R15: 8801c96f01d8
> > >   __set_page_dirty+0x100/0x4b0 fs/buffer.c:605
> > >   mark_buffer_dirty+0x454/0x5d0 fs/buffer.c:1126
> > Huh, I don't see how this could possibly happen. The warning is:
> > 
> >  WARN_ON_ONCE(debug_locks &&
> >   (!lockdep_is_held(&inode->i_lock) &&
> >    !lockdep_is_held(&inode->i_mapping->tree_lock) &&
> >    !lockdep_is_held(&inode->i_wb->list_lock)));
> > 
> > Now __set_page_dirty() which called account_page_dirtied() just did:
> > 
> > spin_lock_irqsave(&mapping->tree_lock, flags);
> > 
> > Now the fact is that account_page_dirtied() actually checks
> > mapping->host->i_mapping->tree_lock so if mapping->host->i_mapping doesn't
> > get us back to 'mapping', that would explain the warning. But then
> > something would have to be very wrong in the GFS2 land... Adding some GFS2
> > related CCs just in case they have some idea.
> So I looked at this for some time trying to work out what is going on. I'm
> still not 100% sure now, but let's see if we can figure it out.
> 
> The stack trace shows a call path to the end of the journal flush code where
> we are unpinning pages that have been through the journal. Assuming that
> jdata is not in use (it is used for some internal files, even if it is not
> selected by the user) then it is most likely that this applies to a metadata
> page.
> 
> For recent gfs2, all the metadata pages are kept in an address space which
> for inodes is in the relevant glock, and for resource groups is a single
> address space kept for only that purpose in the super block. In both of
> those cases the mapping->host points to the block device inode. Since the
> inode's mapping->host reflects only the block device address space (unused
> by gfs2) we would not expect it to point back to the relevant address space.
> 
> As far as I can tell this usage is ok, since it doesn't make much sense to
> require lots 

Re: [Cluster-devel] WARNING in account_page_dirtied

2018-04-04 Thread Steven Whitehouse

Hi,


On 04/04/18 13:36, Jan Kara wrote:

Hi,

On Wed 04-04-18 10:24:48, Steven Whitehouse wrote:

On 03/04/18 13:05, Jan Kara wrote:

Hello,

On Sun 01-04-18 10:01:02, syzbot wrote:

syzbot hit the following crash on upstream commit
10b84daddbec72c6b440216a69de9a9605127f7a (Sat Mar 31 17:59:00 2018 +)
Merge branch 'perf-urgent-for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
syzbot dashboard link:
https://syzkaller.appspot.com/bug?extid=b7772c65a1d88bfd8fca

C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5705587757154304
syzkaller reproducer:
https://syzkaller.appspot.com/x/repro.syz?id=5644332530925568
Raw console output:
https://syzkaller.appspot.com/x/log.txt?id=5472755969425408
Kernel config:
https://syzkaller.appspot.com/x/.config?id=-2760467897697295172
compiler: gcc (GCC) 7.1.1 20170620

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+b7772c65a1d88bfd8...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for
details.
If you forward the report, please keep this part and the footer.

gfs2: fsid=loop0.0: jid=0, already locked for use
gfs2: fsid=loop0.0: jid=0: Looking at journal...
gfs2: fsid=loop0.0: jid=0: Done
gfs2: fsid=loop0.0: first mount done, others may mount
gfs2: fsid=loop0.0: found 1 quota changes
WARNING: CPU: 0 PID: 4469 at ./include/linux/backing-dev.h:341 inode_to_wb
include/linux/backing-dev.h:338 [inline]
WARNING: CPU: 0 PID: 4469 at ./include/linux/backing-dev.h:341
account_page_dirtied+0x8f9/0xcb0 mm/page-writeback.c:2416
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 4469 Comm: syzkaller368843 Not tainted 4.16.0-rc7+ #9
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
   __dump_stack lib/dump_stack.c:17 [inline]
   dump_stack+0x194/0x24d lib/dump_stack.c:53
   panic+0x1e4/0x41c kernel/panic.c:183
   __warn+0x1dc/0x200 kernel/panic.c:547
   report_bug+0x1f4/0x2b0 lib/bug.c:186
   fixup_bug.part.10+0x37/0x80 arch/x86/kernel/traps.c:178
   fixup_bug arch/x86/kernel/traps.c:247 [inline]
   do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
   do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
   invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986
RIP: 0010:inode_to_wb include/linux/backing-dev.h:338 [inline]
RIP: 0010:account_page_dirtied+0x8f9/0xcb0 mm/page-writeback.c:2416
RSP: 0018:8801d966e5c0 EFLAGS: 00010093
RAX: 8801acb7e600 RBX: 11003b2cdcba RCX: 818f47a9
RDX:  RSI: 8801d3338148 RDI: 0082
RBP: 8801d966e698 R08: 11003b2cdc13 R09: 000c
R10: 8801d966e558 R11: 0002 R12: 8801c96f0368
R13: ea0006b12780 R14: 8801c96f01d8 R15: 8801c96f01d8
   __set_page_dirty+0x100/0x4b0 fs/buffer.c:605
   mark_buffer_dirty+0x454/0x5d0 fs/buffer.c:1126

Huh, I don't see how this could possibly happen. The warning is:

  WARN_ON_ONCE(debug_locks &&
   (!lockdep_is_held(&inode->i_lock) &&
    !lockdep_is_held(&inode->i_mapping->tree_lock) &&
    !lockdep_is_held(&inode->i_wb->list_lock)));

Now __set_page_dirty() which called account_page_dirtied() just did:

spin_lock_irqsave(&mapping->tree_lock, flags);

Now the fact is that account_page_dirtied() actually checks
mapping->host->i_mapping->tree_lock so if mapping->host->i_mapping doesn't
get us back to 'mapping', that would explain the warning. But then
something would have to be very wrong in the GFS2 land... Adding some GFS2
related CCs just in case they have some idea.

So I looked at this for some time trying to work out what is going on. I'm
still not 100% sure now, but let's see if we can figure it out.

The stack trace shows a call path to the end of the journal flush code where
we are unpinning pages that have been through the journal. Assuming that
jdata is not in use (it is used for some internal files, even if it is not
selected by the user) then it is most likely that this applies to a metadata
page.

For recent gfs2, all the metadata pages are kept in an address space which
for inodes is in the relevant glock, and for resource groups is a single
address space kept for only that purpose in the super block. In both of
those cases the mapping->host points to the block device inode. Since the
inode's mapping->host reflects only the block device address space (unused
by gfs2) we would not expect it to point back to the relevant address space.

As far as I can tell this usage is OK, since it doesn't make much sense to
require lots of inodes to be hanging around uselessly just to keep metadata
pages in. That, after all, is why the address space and inode are separate
structures in the first place: it is not a one-to-one relationship. So I
think that probably explains why this triggers, since the test is not
really a valid one in all cases.

The problem is we really do expect mapping->host->i_mapping == mapping 
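
To spell out the mismatch under discussion, here is a condensed sketch of
the 4.16-era call path (illustrative only, not the actual sources; the
'host' local and the comments are added for explanation):

/* Sketch: what __set_page_dirty() holds vs. what inode_to_wb() checks. */
static void set_dirty_sketch(struct address_space *mapping)
{
	struct inode *host = mapping->host;
	unsigned long flags;

	spin_lock_irqsave(&mapping->tree_lock, flags);	/* the lock we hold */

	/* account_page_dirtied() -> inode_to_wb(host) then asserts: */
	WARN_ON_ONCE(debug_locks &&
		     (!lockdep_is_held(&host->i_lock) &&
		      !lockdep_is_held(&host->i_mapping->tree_lock) &&
		      !lockdep_is_held(&host->i_wb->list_lock)));

	/* For a gfs2 metadata mapping, host is the block device inode, so
	 * host->i_mapping is the bdev's own mapping, not 'mapping'. The
	 * tree_lock held above is then none of the three locks inspected,
	 * and the warning fires even though the locking is consistent. */

	spin_unlock_irqrestore(&mapping->tree_lock, flags);
}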

Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-04-04 Thread Herbert Xu
On Wed, Apr 04, 2018 at 11:46:28AM -0400, Bob Peterson wrote:
>
> The patches look good. The big question is whether to add them to this
> merge window while it's still open. Opinions?

We're still hashing out the rhashtable interface so I don't think
now is the time to rush things.

Thanks,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt



Re: [Cluster-devel] gfs2 vmscan warnings (was: GFS2 Errors)

2018-04-04 Thread Andrew Price
On 19/07/17 10:09, Steven Whitehouse wrote:
>> On 2017-07-18 07:25 PM, Kristián Feldsam wrote:

Hello, I see GFS2 errors in the log today and there is nothing about them
on the net, so I am writing to this mailing list.

node2    19.07.2017 01:11:55    kernel    kern    err    vmscan: shrink_slab:
gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
nr=-4549568322848002755
node2    19.07.2017 01:10:56    kernel    kern    err    vmscan: shrink_slab:
gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
nr=-8191295421473926116
node2    19.07.2017 01:10:48    kernel    kern    err    vmscan: shrink_slab:
gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
nr=-8225402411152149004
node2    19.07.2017 01:10:47    kernel    kern    err    vmscan: shrink_slab:
gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
nr=-8230186816585019317
node2    19.07.2017 01:10:45    kernel    kern    err    vmscan: shrink_slab:


This looks like a bug to me, since the object count should never be
negative. The glock shrinker is not (yet) zone aware, although the quota
shrinker is. Not sure if that is related, but it is certainly something
we'd like to investigate further. That said, the messages in themselves
are harmless, but they likely indicate a less than optimal use of memory.
If there are any details that can be shared about the use case and how to
reproduce it, that would be very helpful for us to know. Also, what kernel
version was this?


I can reproduce this (by running fsstress for a while, unmounting and
running fsck.gfs2) with the latest mainline kernel. As far as I can tell,
gfs2's lru_count is positive until unmount, but then gets decremented a
lot across more than one code path. It's decremented at:


gfs2_glock_remove_from_lru
  clear_glock+0x50/0x60
  glock_hash_walk+0xe1/0xf0
  gfs2_gl_hash_clear+0x40/0x120
  ? gfs2_jindex_free+0x106/0x140
  gfs2_put_super+0x131/0x1d0
  generic_shutdown_super+0x69/0x110
  kill_block_super+0x21/0x50
  deactivate_locked_super+0x39/0x70
  cleanup_mnt+0x3b/0x70
  task_work_run+0x82/0xb0
  exit_to_usermode_loop+0x87/0x90
  do_syscall_64+0x194/0x1a0
  entry_SYSCALL_64_after_hwframe+0x42/0xb7

Workqueue: glock_workqueue glock_work_func
gfs2_glock_remove_from_lru
  __gfs2_glock_put+0x108/0x1c0
  process_one_work+0x20c/0x660
  worker_thread+0x3a/0x390
  ? process_one_work+0x660/0x660
  kthread+0x11c/0x140
  ? kthread_delayed_work_timer_fn+0x90/0x90
  ret_from_fork+0x24/0x30

gfs2_glock_remove_from_lru
  __gfs2_glock_put+0x108/0x1c0
  gfs2_evict_inode+0x2f3/0x650
  ? find_held_lock+0x2d/0x90
  ? evict+0xba/0x190
  ? evict+0xcd/0x190
  evict+0xcd/0x190
  dispose_list+0x51/0x80
  evict_inodes+0x1a5/0x1b0
  generic_shutdown_super+0x3f/0x110
  kill_block_super+0x21/0x50
  deactivate_locked_super+0x39/0x70
  cleanup_mnt+0x3b/0x70
  task_work_run+0x82/0xb0
  exit_to_usermode_loop+0x87/0x90
  do_syscall_64+0x194/0x1a0
  entry_SYSCALL_64_after_hwframe+0x42/0xb7

gfs2_scan_glock_lru
  gfs2_glock_shrink_scan
  shrink_slab.part.44+0x1c6/0x5e0
  shrink_node+0x350/0x360
  kswapd+0x2d9/0x8e0
  ? mem_cgroup_shrink_node+0x310/0x310
  kthread+0x11c/0x140
  ? kthread_delayed_work_timer_fn+0x90/0x90
  ret_from_fork+0x24/0x30

I tried a scripted bisect on it and it threw up 'mm: use sc->priority 
for slab shrink targets' (9092c71bb724dba2ecba849eae69e5c9d39bd3d2) but 
that's fairly recent and Kristián's report was in an older kernel so I'm 
not sure how reliable that is.


My expectations are uncalibrated wrt the lru list at unmount so I'm not 
seeing the root cause at the moment.
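
For what it's worth, the arithmetic that would produce those huge negative
nr= values is easy to sketch (illustrative names, assuming the counter
really does underflow; this is not the gfs2 code):

/* Sketch of the suspected underflow; illustrative, not the gfs2 code. */
static atomic_long_t lru_count_sketch = ATOMIC_LONG_INIT(0);

static void remove_from_lru_sketch(void)
{
	/* If two of the unmount paths above each do this once for the
	 * same glock, the counter ends up below zero. */
	atomic_long_dec(&lru_count_sketch);
}

static unsigned long shrink_count_sketch(void)
{
	/* count_objects() returns unsigned long, so a negative counter
	 * is reported as a value near ULONG_MAX; vmscan's scaled
	 * arithmetic then overflows its signed total_scan and
	 * shrink_slab prints "negative objects to delete nr=...". */
	return (unsigned long)atomic_long_read(&lru_count_sketch);
}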


Andy



Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-04-04 Thread Bob Peterson
- Original Message -
> Here's a second version of the patch (now a patch set) to eliminate
> rhashtable_walk_peek in gfs2.
> 
> The first patch introduces lockref_put_not_zero, the inverse of
> lockref_get_not_zero.
> 
> The second patch eliminates rhashtable_walk_peek in gfs2.  In
> gfs2_glock_iter_next, the new lockref function from patch one is used to
> drop a lockref count as long as the count doesn't drop to zero.  This is
> almost always the case; if there is a risk of dropping the last
> reference, we must defer that to a work queue because dropping the last
> reference may sleep.
> 
> Thanks,
> Andreas
> 
> Andreas Gruenbacher (2):
>   lockref: Add lockref_put_not_zero
>   gfs2: Stop using rhashtable_walk_peek
> 
>  fs/gfs2/glock.c         | 47 ++++++++++++++++++++++++++++-------------------
>  include/linux/lockref.h |  1 +
>  lib/lockref.c           | 28 ++++++++++++++++++++++++++++
>  3 files changed, 57 insertions(+), 19 deletions(-)
> 
> --
> 2.14.3
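
For reference, the semantics described in the cover letter come down to
something like this (a simplified, spinlock-only sketch; presumably the
real lib/lockref.c version also gets a cmpxchg fast path to match
lockref_get_not_zero):

/* Sketch of lockref_put_not_zero semantics; illustrative only. */
int lockref_put_not_zero_sketch(struct lockref *lockref)
{
	int retval = 0;

	spin_lock(&lockref->lock);
	if (lockref->count > 1) {	/* never drop the last reference */
		lockref->count--;
		retval = 1;
	}
	spin_unlock(&lockref->lock);
	return retval;	/* 0 means the caller must defer the final put */
}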

Hi,

The patches look good. The big question is whether to add them to this
merge window while it's still open. Opinions?

Acked-by: Bob Peterson 

Regards,

Bob Peterson



Re: [Cluster-devel] gfs2 vmscan warnings (was: GFS2 Errors)

2018-04-04 Thread FeldHost™ Admin
Hello, I switched to Fedora 26 and I am seeing those errors again on kernel 4.12.x.

> On 4 Apr 2018, at 17:28, Andrew Price  wrote:
> 
> On 19/07/17 10:09, Steven Whitehouse wrote:
>> On 2017-07-18 07:25 PM, Kristián Feldsam wrote:
 Hello, I see GFS2 errors in the log today and there is nothing about them
 on the net, so I am writing to this mailing list.
 
 node2    19.07.2017 01:11:55    kernel    kern    err    vmscan: shrink_slab:
 gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
 nr=-4549568322848002755
 node2    19.07.2017 01:10:56    kernel    kern    err    vmscan: shrink_slab:
 gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
 nr=-8191295421473926116
 node2    19.07.2017 01:10:48    kernel    kern    err    vmscan: shrink_slab:
 gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
 nr=-8225402411152149004
 node2    19.07.2017 01:10:47    kernel    kern    err    vmscan: shrink_slab:
 gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
 nr=-8230186816585019317
 node2    19.07.2017 01:10:45    kernel    kern    err    vmscan: shrink_slab:
> 
>> This looks like a bug to me, since the object count should never be
>> negative. The glock shrinker is not (yet) zone aware, although the quota
>> shrinker is. Not sure if that is related, but it is certainly something
>> we'd like to investigate further. That said, the messages in themselves
>> are harmless, but they likely indicate a less than optimal use of memory.
>> If there are any details that can be shared about the use case and how to
>> reproduce it, that would be very helpful for us to know. Also, what kernel
>> version was this?
> 
> I can reproduce this (by running fsstress for a while, unmounting and
> running fsck.gfs2) with the latest mainline kernel. As far as I can tell,
> gfs2's lru_count is positive until unmount, but then gets decremented a
> lot across more than one code path. It's decremented at:
> 
> gfs2_glock_remove_from_lru
>  clear_glock+0x50/0x60
>  glock_hash_walk+0xe1/0xf0
>  gfs2_gl_hash_clear+0x40/0x120
>  ? gfs2_jindex_free+0x106/0x140
>  gfs2_put_super+0x131/0x1d0
>  generic_shutdown_super+0x69/0x110
>  kill_block_super+0x21/0x50
>  deactivate_locked_super+0x39/0x70
>  cleanup_mnt+0x3b/0x70
>  task_work_run+0x82/0xb0
>  exit_to_usermode_loop+0x87/0x90
>  do_syscall_64+0x194/0x1a0
>  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> 
> Workqueue: glock_workqueue glock_work_func
> gfs2_glock_remove_from_lru
>  __gfs2_glock_put+0x108/0x1c0
>  process_one_work+0x20c/0x660
>  worker_thread+0x3a/0x390
>  ? process_one_work+0x660/0x660
>  kthread+0x11c/0x140
>  ? kthread_delayed_work_timer_fn+0x90/0x90
>  ret_from_fork+0x24/0x30
> 
> gfs2_glock_remove_from_lru
>  __gfs2_glock_put+0x108/0x1c0
>  gfs2_evict_inode+0x2f3/0x650
>  ? find_held_lock+0x2d/0x90
>  ? evict+0xba/0x190
>  ? evict+0xcd/0x190
>  evict+0xcd/0x190
>  dispose_list+0x51/0x80
>  evict_inodes+0x1a5/0x1b0
>  generic_shutdown_super+0x3f/0x110
>  kill_block_super+0x21/0x50
>  deactivate_locked_super+0x39/0x70
>  cleanup_mnt+0x3b/0x70
>  task_work_run+0x82/0xb0
>  exit_to_usermode_loop+0x87/0x90
>  do_syscall_64+0x194/0x1a0
>  entry_SYSCALL_64_after_hwframe+0x42/0xb7
> 
> gfs2_scan_glock_lru
>  gfs2_glock_shrink_scan
>  shrink_slab.part.44+0x1c6/0x5e0
>  shrink_node+0x350/0x360
>  kswapd+0x2d9/0x8e0
>  ? mem_cgroup_shrink_node+0x310/0x310
>  kthread+0x11c/0x140
>  ? kthread_delayed_work_timer_fn+0x90/0x90
>  ret_from_fork+0x24/0x30
> 
> I tried a scripted bisect on it and it threw up 'mm: use sc->priority for 
> slab shrink targets' (9092c71bb724dba2ecba849eae69e5c9d39bd3d2) but that's 
> fairly recent and Kristián's report was in an older kernel so I'm not sure 
> how reliable that is.
> 
> My expectations are uncalibrated wrt the lru list at unmount so I'm not 
> seeing the root cause at the moment.
> 
> Andy




Re: [Cluster-devel] [PATCH v2 0/2] gfs2: Stop using rhashtable_walk_peek

2018-04-04 Thread Andreas Grünbacher
Herbert Xu  wrote on Wed, 4 Apr 2018 at
17:51:

> On Wed, Apr 04, 2018 at 11:46:28AM -0400, Bob Peterson wrote:
> >
> > The patches look good. The big question is whether to add them to this
> > merge window while it's still open. Opinions?
>
> We're still hashing out the rhashtable interface so I don't think now is
> the time to rush things.


Fair enough. No matter how rhashtable_walk_peek changes, we'll still need
these two patches to fix the glock dump though.

Thanks,
Andreas
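
Roughly, the pattern the cover letter describes looks like this
(illustrative names based on its description, not the actual gfs2 patch):
the dump keeps a reference on the last glock it returned so the walk can
resume from it without rhashtable_walk_peek, and the final put is
deferred because it may sleep:

/* Sketch of dropping the iterator's reference; illustrative only. */
static void glock_iter_drop_sketch(struct gfs2_glock *gl)
{
	/* Drop our reference unless it is the last one... */
	if (!lockref_put_not_zero(&gl->gl_lockref))
		/* ...in which case defer the final put to a work queue,
		 * since dropping the last reference may sleep. */
		gfs2_glock_queue_put(gl);
}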


Re: [Cluster-devel] WARNING in account_page_dirtied

2018-04-04 Thread Steven Whitehouse

Hi,


On 03/04/18 13:05, Jan Kara wrote:

Hello,

On Sun 01-04-18 10:01:02, syzbot wrote:

syzbot hit the following crash on upstream commit
10b84daddbec72c6b440216a69de9a9605127f7a (Sat Mar 31 17:59:00 2018 +)
Merge branch 'perf-urgent-for-linus' of
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
syzbot dashboard link:
https://syzkaller.appspot.com/bug?extid=b7772c65a1d88bfd8fca

C reproducer: https://syzkaller.appspot.com/x/repro.c?id=5705587757154304
syzkaller reproducer:
https://syzkaller.appspot.com/x/repro.syz?id=5644332530925568
Raw console output:
https://syzkaller.appspot.com/x/log.txt?id=5472755969425408
Kernel config:
https://syzkaller.appspot.com/x/.config?id=-2760467897697295172
compiler: gcc (GCC) 7.1.1 20170620

IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: syzbot+b7772c65a1d88bfd8...@syzkaller.appspotmail.com
It will help syzbot understand when the bug is fixed. See footer for
details.
If you forward the report, please keep this part and the footer.

gfs2: fsid=loop0.0: jid=0, already locked for use
gfs2: fsid=loop0.0: jid=0: Looking at journal...
gfs2: fsid=loop0.0: jid=0: Done
gfs2: fsid=loop0.0: first mount done, others may mount
gfs2: fsid=loop0.0: found 1 quota changes
WARNING: CPU: 0 PID: 4469 at ./include/linux/backing-dev.h:341 inode_to_wb
include/linux/backing-dev.h:338 [inline]
WARNING: CPU: 0 PID: 4469 at ./include/linux/backing-dev.h:341
account_page_dirtied+0x8f9/0xcb0 mm/page-writeback.c:2416
Kernel panic - not syncing: panic_on_warn set ...

CPU: 0 PID: 4469 Comm: syzkaller368843 Not tainted 4.16.0-rc7+ #9
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
  __dump_stack lib/dump_stack.c:17 [inline]
  dump_stack+0x194/0x24d lib/dump_stack.c:53
  panic+0x1e4/0x41c kernel/panic.c:183
  __warn+0x1dc/0x200 kernel/panic.c:547
  report_bug+0x1f4/0x2b0 lib/bug.c:186
  fixup_bug.part.10+0x37/0x80 arch/x86/kernel/traps.c:178
  fixup_bug arch/x86/kernel/traps.c:247 [inline]
  do_error_trap+0x2d7/0x3e0 arch/x86/kernel/traps.c:296
  do_invalid_op+0x1b/0x20 arch/x86/kernel/traps.c:315
  invalid_op+0x1b/0x40 arch/x86/entry/entry_64.S:986
RIP: 0010:inode_to_wb include/linux/backing-dev.h:338 [inline]
RIP: 0010:account_page_dirtied+0x8f9/0xcb0 mm/page-writeback.c:2416
RSP: 0018:8801d966e5c0 EFLAGS: 00010093
RAX: 8801acb7e600 RBX: 11003b2cdcba RCX: 818f47a9
RDX:  RSI: 8801d3338148 RDI: 0082
RBP: 8801d966e698 R08: 11003b2cdc13 R09: 000c
R10: 8801d966e558 R11: 0002 R12: 8801c96f0368
R13: ea0006b12780 R14: 8801c96f01d8 R15: 8801c96f01d8
  __set_page_dirty+0x100/0x4b0 fs/buffer.c:605
  mark_buffer_dirty+0x454/0x5d0 fs/buffer.c:1126

Huh, I don't see how this could possibly happen. The warning is:

 WARN_ON_ONCE(debug_locks &&
  (!lockdep_is_held(&inode->i_lock) &&
   !lockdep_is_held(&inode->i_mapping->tree_lock) &&
   !lockdep_is_held(&inode->i_wb->list_lock)));

Now __set_page_dirty() which called account_page_dirtied() just did:

spin_lock_irqsave(&mapping->tree_lock, flags);

Now the fact is that account_page_dirtied() actually checks
mapping->host->i_mapping->tree_lock so if mapping->host->i_mapping doesn't
get us back to 'mapping', that would explain the warning. But then
something would have to be very wrong in the GFS2 land... Adding some GFS2
related CCs just in case they have some idea.
So I looked at this for some time trying to work out what is going on.
I'm still not 100% sure now, but let's see if we can figure it out.


The stack trace shows a call path to the end of the journal flush code 
where we are unpinning pages that have been through the journal. 
Assuming that jdata is not in use (it is used for some internal files, 
even if it is not selected by the user) then it is most likely that this 
applies to a metadata page.


For recent gfs2, all the metadata pages are kept in an address space 
which for inodes is in the relevant glock, and for resource groups is a 
single address space kept for only that purpose in the super block. In 
both of those cases the mapping->host points to the block device inode. 
Since the inode's mapping->host reflects only the block device address 
space (unused by gfs2) we would not expect it to point back to the 
relevant address space.


As far as I can tell this usage is OK, since it doesn't make much sense
to require lots of inodes to be hanging around uselessly just to keep
metadata pages in. That, after all, is why the address space and inode
are separate structures in the first place: it is not a one-to-one
relationship. So I think that probably explains why this triggers, since
the test is not really a valid one in all cases.


Steve.


  gfs2_unpin+0x143/0x12c0 fs/gfs2/lops.c:108
  buf_lo_after_commit+0x273/0x430 fs/gfs2/lops.c:512
  lops_after_commit fs/gfs2/lops.h:67 [inline]