Re: XFS check crash (WAS Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory)

2019-11-29 Thread Daniel Axtens
> 
> Nope, it's vm_map_ram() not being handled
 
 
 Another suspicious one. Related to kasan/vmalloc?
>>> 
>>> Very likely the same as with ion:
>>> 
>>> # git grep vm_map_ram|grep xfs
>>> fs/xfs/xfs_buf.c:* vm_map_ram() will allocate auxiliary 
>>> structures (e.g.
>>> fs/xfs/xfs_buf.c:   bp->b_addr = 
>>> vm_map_ram(bp->b_pages, bp->b_page_count,
>> 
>> Aaargh, that's an embarrassing miss.
>> 
>> It's a bit intricate because the kasan_populate_vmalloc function is
>> currently set up to take a vm_struct, not a vmap_area, but I'll see if I
>> can get something simple out this evening - I'm away for the first part
>> of next week.

For crashes in XFS, binder etc that implicate vm_map_ram, see:
https://lore.kernel.org/linux-mm/20191129154519.30964-1-...@axtens.net/
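
For reference, here is a rough sketch of the general direction such a fix
can take (illustration only, not the series at the link above, which may
differ in naming and detail): make the shadow-population hook take a plain
address range instead of a vm_struct, and call it from the common vmap_area
allocation path, so that users which never get a vm_struct - vm_map_ram()
and friends - are covered as well.

/*
 * Sketch only.  kasan_populate_vmalloc_pte() is the per-PTE callback
 * visible in the stack traces elsewhere in this thread; here the wrapper
 * is keyed off an address range rather than a vm_struct, so it can be
 * called from alloc_vmap_area() for every vmap_area.
 */
int kasan_populate_vmalloc(unsigned long addr, unsigned long size)
{
	unsigned long shadow_start, shadow_end;

	shadow_start = (unsigned long)kasan_mem_to_shadow((void *)addr);
	shadow_start = ALIGN_DOWN(shadow_start, PAGE_SIZE);
	shadow_end = (unsigned long)kasan_mem_to_shadow((void *)addr + size);
	shadow_end = ALIGN(shadow_end, PAGE_SIZE);

	return apply_to_page_range(&init_mm, shadow_start,
				   shadow_end - shadow_start,
				   kasan_populate_vmalloc_pte, NULL);
}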

The easiest way I found to repro the bug is
sudo modprobe i915 mock_selftest=-1

For the lock warnings, the one that goes through the percpu alloc path
already has a patch queued in mmots.

For Dmitry's latest one where there's an allocation in the
purge_vmap_area_lazy path that triggers a locking warning, you'll have
to wait until next week, sorry.

Regards,
Daniel


XFS check crash (WAS Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory)

2019-11-29 Thread Qian Cai



> On Nov 29, 2019, at 7:29 AM, Daniel Axtens  wrote:
> 
 
 Nope, it's vm_map_ram() not being handled
>>> 
>>> 
>>> Another suspicious one. Related to kasan/vmalloc?
>> 
>> Very likely the same as with ion:
>> 
>> # git grep vm_map_ram|grep xfs
>> fs/xfs/xfs_buf.c:* vm_map_ram() will allocate auxiliary 
>> structures (e.g.
>> fs/xfs/xfs_buf.c:   bp->b_addr = vm_map_ram(bp->b_pages, 
>> bp->b_page_count,
> 
> Aaargh, that's an embarrassing miss.
> 
> It's a bit intricate because the kasan_populate_vmalloc function is
> currently set up to take a vm_struct, not a vmap_area, but I'll see if I
> can get something simple out this evening - I'm away for the first part
> of next week.
> 
> Do you have to do anything interesting to get it to explode with xfs? Is
> it as simple as mounting a drive and doing some I/O? Or do you need to
> do something more involved?


I instead trigger something a bit different by manually triggering a crash
first so that the XFS partition is shut down uncleanly.

# echo c >/proc/sysrq-trigger

and then reboot into the same kernel, where it will crash while checking the
XFS filesystem. This can be worked around by first rebooting into an older
kernel (v4.18), where xfs_repair completes successfully, and then rebooting
into the new linux-next kernel works fine.

[  OK  ] Started File System Check on /dev/mapper/rhel_hpe--sy680gen9--01-root.
 Mounting /sysroot...
[  141.177726][ T1730] SGI XFS with security attributes, no debug enabled
[  141.432382][ T1720] XFS (dm-0): Mounting V5 Filesystem
[**] A start job is running for /sysroot (39s / 1min 51s)[  158.738816][ 
T1720] XFS (dm-0): Starting recovery (logdev: internal)
[  158.792010][  T844] BUG: unable to handle page fault for address: 
f52001fc
[  158.830913][  T844] #PF: supervisor read access in kernel mode
[  158.859680][  T844] #PF: error_code(0x) - not-present page
[  158.886057][  T844] PGD 207ffe3067 P4D 207ffe3067 PUD 2071f2067 PMD 
f68e08067 PTE 0
[  158.922065][  T844] Oops:  [#1] SMP DEBUG_PAGEALLOC KASAN PTI
[  158.949620][  T844] CPU: 112 PID: 844 Comm: kworker/112:1 Not tainted 
5.4.0-next-20191127+ #3
[  158.988759][  T844] Hardware name: HP Synergy 680 Gen9/Synergy 680 Gen9 
Compute Module, BIOS I40 05/23/2018
[  159.033380][  T844] Workqueue: xfs-buf/dm-0 xfs_buf_ioend_work [xfs]
[  159.061935][  T844] RIP: 0010:__asan_load4+0x3a/0xa0
[  159.061941][  T844] Code: 00 00 00 00 00 00 ff 48 39 f8 77 6d 48 8d 47 03 48 
89 c2 83 e2 07 48 83 fa 02 76 30 48 be 00 00 00 00 00 fc ff df 48 c1 e8 03 <0f> 
b6 04 30 84 c0 75 3e 5d c3 48 b8 00 00 00 00 00 80 ff ff eb c7
[  159.061944][  T844] RSP: 0018:c9000a4b7cb0 EFLAGS: 00010a06
[  159.061949][  T844] RAX: 192001fc RBX: c9000f80 RCX: 
c06d10ae
[  159.061952][  T844] RDX: 0003 RSI: dc00 RDI: 
c9000f800060
[  159.061955][  T844] RBP: c9000a4b7cb0 R08: ed130bee89e5 R09: 
0001
[  159.061958][  T844] R10: ed130bee89e4 R11: 88985f744f23 R12: 

[  159.061961][  T844] R13: 889724be0040 R14: 88836c8e5000 R15: 
000c8000
[  159.061965][  T844] FS:  () GS:88985f70() 
knlGS:
[  159.061968][  T844] CS:  0010 DS:  ES:  CR0: 80050033
[  159.061971][  T844] CR2: f52001fc CR3: 001f615b8004 CR4: 
003606e0
[  159.061974][  T844] DR0:  DR1:  DR2: 

[  159.061976][  T844] DR3:  DR6: fffe0ff0 DR7: 
0400
[  159.061978][  T844] Call Trace:
[  159.062118][  T844]  xfs_inode_buf_verify+0x13e/0x230 [xfs]
[  159.062264][  T844]  xfs_inode_buf_readahead_verify+0x13/0x20 [xfs]
[  159.634441][  T844]  xfs_buf_ioend+0x153/0x6b0 [xfs]
[  159.634455][  T844]  ? trace_hardirqs_on+0x3a/0x160
[  159.679087][  T844]  xfs_buf_ioend_work+0x15/0x20 [xfs]
[  159.702689][  T844]  process_one_work+0x579/0xb90
[  159.723898][  T844]  ? pwq_dec_nr_in_flight+0x170/0x170
[  159.747499][  T844]  worker_thread+0x63/0x5b0
[  159.767531][  T844]  ? process_one_work+0xb90/0xb90
[  159.789549][  T844]  kthread+0x1e6/0x210
[  159.807166][  T844]  ? kthread_create_worker_on_cpu+0xc0/0xc0
[  159.833064][  T844]  ret_from_fork+0x3a/0x50
[  159.852200][  T844] Modules linked in: xfs sd_mod bnx2x mdio firmware_class 
hpsa scsi_transport_sas dm_mirror dm_region_hash dm_log dm_mod
[  159.915273][  T844] CR2: f52001fc
[  159.934029][  T844] ---[ end trace 3f3b30f5fc34bbf1 ]---
[  159.957937][  T844] RIP: 0010:__asan_load4+0x3a/0xa0
[  159.980316][  T844] Code: 00 00 00 00 00 00 ff 48 39 f8 77 6d 48 8d 47 03 48 
89 c2 83 e2 07 48 83 fa 02 76 30 48 be 00 00 00 00 00 fc ff df 48 c1 e8 03 <0f> 
b6 04 30 84 c0 75 3e 5d c3 48 b8 00 00 00 00 00 80 ff ff eb c7
[  160.068386][  T844] RSP: 0018:c9000a4b7cb0 EFLAGS: 00010a06
[  160.068389][  T844] RAX: 192001fc RBX: c9000f80 RCX: 
c06d10ae

Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-11-29 Thread Dmitry Vyukov
On Fri, Nov 29, 2019 at 1:29 PM Daniel Axtens  wrote:
> >>> Nope, it's vm_map_ram() not being handled
> >> Another suspicious one. Related to kasan/vmalloc?
> > Very likely the same as with ion:
> >
> > # git grep vm_map_ram|grep xfs
> > fs/xfs/xfs_buf.c:* vm_map_ram() will allocate auxiliary 
> > structures (e.g.
> > fs/xfs/xfs_buf.c:   bp->b_addr = 
> > vm_map_ram(bp->b_pages, bp->b_page_count,
>
> Aaargh, that's an embarrassing miss.
>
> It's a bit intricate because the kasan_populate_vmalloc function is
> currently set up to take a vm_struct, not a vmap_area, but I'll see if I
> can get something simple out this evening - I'm away for the first part
> of next week.
>
> Do you have to do anything interesting to get it to explode with xfs? Is
> it as simple as mounting a drive and doing some I/O? Or do you need to
> do something more involved?

As simple as running syzkaller :)
with this config
https://github.com/google/syzkaller/blob/master/dashboard/config/upstream-kasan.config

> Regards,
> Daniel
>
> >
> >>
> >> BUG: unable to handle page fault for address: f52005b8
> >> #PF: supervisor read access in kernel mode
> >> #PF: error_code(0x) - not-present page
> >> PGD 7ffcd067 P4D 7ffcd067 PUD 2cd10067 PMD 66d76067 PTE 0
> >> Oops:  [#1] PREEMPT SMP KASAN
> >> CPU: 2 PID: 9211 Comm: syz-executor.2 Not tainted 5.4.0-next-20191129+ #6
> >> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> >> rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> >> RIP: 0010:xfs_sb_read_verify+0xe9/0x540 fs/xfs/libxfs/xfs_sb.c:691
> >> Code: fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 1e 04 00 00 4d 8b ac 24
> >> 30 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 <0f> b6
> >> 04 02 84 c0 74 08 3c 03 0f 8e ad 03 00 00 41 8b 45 00 bf 58
> >> RSP: 0018:c9000a58f8d0 EFLAGS: 00010a06
> >> RAX: dc00 RBX: 1920014b1f1d RCX: c9000af42000
> >> RDX: 192005b8 RSI: 82914404 RDI: 88805cdb1460
> >> RBP: c9000a58fab0 R08: 8880610cd380 R09: ed1005a87045
> >> R10: ed1005a87044 R11: 88802d438223 R12: 88805cdb1340
> >> R13: c9002dc0 R14: c9000a58fa88 R15: 888061b5c000
> >> FS:  7fb49bda9700() GS:88802d40() 
> >> knlGS:
> >> CS:  0010 DS:  ES:  CR0: 80050033
> >> CR2: f52005b8 CR3: 60769006 CR4: 00760ee0
> >> DR0:  DR1:  DR2: 
> >> DR3:  DR6: fffe0ff0 DR7: 0400
> >> PKRU: 5554
> >> Call Trace:
> >>  xfs_buf_ioend+0x228/0xdc0 fs/xfs/xfs_buf.c:1162
> >>  __xfs_buf_submit+0x38b/0xe50 fs/xfs/xfs_buf.c:1485
> >>  xfs_buf_submit fs/xfs/xfs_buf.h:268 [inline]
> >>  xfs_buf_read_uncached+0x15c/0x560 fs/xfs/xfs_buf.c:897
> >>  xfs_readsb+0x2d0/0x540 fs/xfs/xfs_mount.c:298
> >>  xfs_fc_fill_super+0x3e6/0x11f0 fs/xfs/xfs_super.c:1415
> >>  get_tree_bdev+0x444/0x620 fs/super.c:1340
> >>  xfs_fc_get_tree+0x1c/0x20 fs/xfs/xfs_super.c:1550
> >>  vfs_get_tree+0x8e/0x300 fs/super.c:1545
> >>  do_new_mount fs/namespace.c:2822 [inline]
> >>  do_mount+0x152d/0x1b50 fs/namespace.c:3142
> >>  ksys_mount+0x114/0x130 fs/namespace.c:3351
> >>  __do_sys_mount fs/namespace.c:3365 [inline]
> >>  __se_sys_mount fs/namespace.c:3362 [inline]
> >>  __x64_sys_mount+0xbe/0x150 fs/namespace.c:3362
> >>  do_syscall_64+0xfa/0x780 arch/x86/entry/common.c:294
> >>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >> RIP: 0033:0x46736a
> >> Code: 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f
> >> 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d
> >> 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
> >> RSP: 002b:7fb49bda8a78 EFLAGS: 0202 ORIG_RAX: 00a5
> >> RAX: ffda RBX: 7fb49bda8af0 RCX: 0046736a
> >> RDX: 7fb49bda8ad0 RSI: 2140 RDI: 7fb49bda8af0
> >> RBP: 7fb49bda8ad0 R08: 7fb49bda8b30 R09: 7fb49bda8ad0
> >> R10:  R11: 0202 R12: 7fb49bda8b30
> >> R13: 004b1c60 R14: 004b006d R15: 7fb49bda96bc
> >> Modules linked in:
> >> Dumping ftrace buffer:
> >>(ftrace buffer empty)
> >> CR2: f52005b8
> >> ---[ end trace eddd8949d4c898df ]---
> >> RIP: 0010:xfs_sb_read_verify+0xe9/0x540 fs/xfs/libxfs/xfs_sb.c:691
> >> Code: fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 1e 04 00 00 4d 8b ac 24
> >> 30 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 <0f> b6
> >> 04 02 84 c0 74 08 3c 03 0f 8e ad 03 00 00 41 8b 45 00 bf 58
> >> RSP: 0018:c9000a58f8d0 EFLAGS: 00010a06
> >> RAX: dc00 RBX: 1920014b1f1d RCX: c9000af42000
> >> RDX: 192005b8 RSI: 82914404 RDI: 88805cdb1460
> >> RBP: c9000a58fab0 R08: 8880610cd380 R09: ed1005a87045
> >> R10: ed1005a87044 R11: 88802d438223 R12: 88805cdb1340
> >> R13: c9002dc0 R14: c9000a58fa88 R15: f

Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-11-29 Thread Daniel Axtens


>>> Nope, it's vm_map_ram() not being handled
>> 
>> 
>> Another suspicious one. Related to kasan/vmalloc?
>
> Very likely the same as with ion:
>
> # git grep vm_map_ram|grep xfs
> fs/xfs/xfs_buf.c:* vm_map_ram() will allocate auxiliary 
> structures (e.g.
> fs/xfs/xfs_buf.c:   bp->b_addr = vm_map_ram(bp->b_pages, 
> bp->b_page_count,

Aaargh, that's an embarrassing miss.

It's a bit intricate because the kasan_populate_vmalloc function is
currently set up to take a vm_struct, not a vmap_area, but I'll see if I
can get something simple out this evening - I'm away for the first part
of next week.

Do you have to do anything interesting to get it to explode with xfs? Is
it as simple as mounting a drive and doing some I/O? Or do you need to
do something more involved?

Regards,
Daniel

>  
>> 
>> BUG: unable to handle page fault for address: f52005b8
>> #PF: supervisor read access in kernel mode
>> #PF: error_code(0x) - not-present page
>> PGD 7ffcd067 P4D 7ffcd067 PUD 2cd10067 PMD 66d76067 PTE 0
>> Oops:  [#1] PREEMPT SMP KASAN
>> CPU: 2 PID: 9211 Comm: syz-executor.2 Not tainted 5.4.0-next-20191129+ #6
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
>> rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
>> RIP: 0010:xfs_sb_read_verify+0xe9/0x540 fs/xfs/libxfs/xfs_sb.c:691
>> Code: fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 1e 04 00 00 4d 8b ac 24
>> 30 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 <0f> b6
>> 04 02 84 c0 74 08 3c 03 0f 8e ad 03 00 00 41 8b 45 00 bf 58
>> RSP: 0018:c9000a58f8d0 EFLAGS: 00010a06
>> RAX: dc00 RBX: 1920014b1f1d RCX: c9000af42000
>> RDX: 192005b8 RSI: 82914404 RDI: 88805cdb1460
>> RBP: c9000a58fab0 R08: 8880610cd380 R09: ed1005a87045
>> R10: ed1005a87044 R11: 88802d438223 R12: 88805cdb1340
>> R13: c9002dc0 R14: c9000a58fa88 R15: 888061b5c000
>> FS:  7fb49bda9700() GS:88802d40() knlGS:
>> CS:  0010 DS:  ES:  CR0: 80050033
>> CR2: f52005b8 CR3: 60769006 CR4: 00760ee0
>> DR0:  DR1:  DR2: 
>> DR3:  DR6: fffe0ff0 DR7: 0400
>> PKRU: 5554
>> Call Trace:
>>  xfs_buf_ioend+0x228/0xdc0 fs/xfs/xfs_buf.c:1162
>>  __xfs_buf_submit+0x38b/0xe50 fs/xfs/xfs_buf.c:1485
>>  xfs_buf_submit fs/xfs/xfs_buf.h:268 [inline]
>>  xfs_buf_read_uncached+0x15c/0x560 fs/xfs/xfs_buf.c:897
>>  xfs_readsb+0x2d0/0x540 fs/xfs/xfs_mount.c:298
>>  xfs_fc_fill_super+0x3e6/0x11f0 fs/xfs/xfs_super.c:1415
>>  get_tree_bdev+0x444/0x620 fs/super.c:1340
>>  xfs_fc_get_tree+0x1c/0x20 fs/xfs/xfs_super.c:1550
>>  vfs_get_tree+0x8e/0x300 fs/super.c:1545
>>  do_new_mount fs/namespace.c:2822 [inline]
>>  do_mount+0x152d/0x1b50 fs/namespace.c:3142
>>  ksys_mount+0x114/0x130 fs/namespace.c:3351
>>  __do_sys_mount fs/namespace.c:3365 [inline]
>>  __se_sys_mount fs/namespace.c:3362 [inline]
>>  __x64_sys_mount+0xbe/0x150 fs/namespace.c:3362
>>  do_syscall_64+0xfa/0x780 arch/x86/entry/common.c:294
>>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
>> RIP: 0033:0x46736a
>> Code: 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f
>> 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d
>> 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
>> RSP: 002b:7fb49bda8a78 EFLAGS: 0202 ORIG_RAX: 00a5
>> RAX: ffda RBX: 7fb49bda8af0 RCX: 0046736a
>> RDX: 7fb49bda8ad0 RSI: 2140 RDI: 7fb49bda8af0
>> RBP: 7fb49bda8ad0 R08: 7fb49bda8b30 R09: 7fb49bda8ad0
>> R10:  R11: 0202 R12: 7fb49bda8b30
>> R13: 004b1c60 R14: 004b006d R15: 7fb49bda96bc
>> Modules linked in:
>> Dumping ftrace buffer:
>>(ftrace buffer empty)
>> CR2: f52005b8
>> ---[ end trace eddd8949d4c898df ]---
>> RIP: 0010:xfs_sb_read_verify+0xe9/0x540 fs/xfs/libxfs/xfs_sb.c:691
>> Code: fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 1e 04 00 00 4d 8b ac 24
>> 30 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 <0f> b6
>> 04 02 84 c0 74 08 3c 03 0f 8e ad 03 00 00 41 8b 45 00 bf 58
>> RSP: 0018:c9000a58f8d0 EFLAGS: 00010a06
>> RAX: dc00 RBX: 1920014b1f1d RCX: c9000af42000
>> RDX: 192005b8 RSI: 82914404 RDI: 88805cdb1460
>> RBP: c9000a58fab0 R08: 8880610cd380 R09: ed1005a87045
>> R10: ed1005a87044 R11: 88802d438223 R12: 88805cdb1340
>> R13: c9002dc0 R14: c9000a58fa88 R15: 888061b5c000
>> FS:  7fb49bda9700() GS:88802d40() knlGS:
>> CS:  0010 DS:  ES:  CR0: 80050033
>> CR2: f52005b8 CR3: 60769006 CR4: 00760ee0
>> DR0:  DR1:  DR2: 
>> DR3:  DR6: fffe0ff0 DR7: 0400
>> PKRU:

Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-11-29 Thread Daniel Axtens
Hi Dmitry,

>> I am testing this support on next-20191129 and seeing the following warnings:
>>
>> BUG: sleeping function called from invalid context at mm/page_alloc.c:4681
>> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 44, name: kworker/1:1
>> 4 locks held by kworker/1:1/44:
>>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at:
>> __write_once_size include/linux/compiler.h:247 [inline]
>>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at:
>> arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at: atomic64_set
>> include/asm-generic/atomic-instrumented.h:868 [inline]
>>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at:
>> atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at: set_work_data
>> kernel/workqueue.c:615 [inline]
>>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at:
>> set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
>>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at:
>> process_one_work+0x88b/0x1750 kernel/workqueue.c:2235
>>  #1: c92afdf0 (pcpu_balance_work){+.+.}, at:
>> process_one_work+0x8c0/0x1750 kernel/workqueue.c:2239
>>  #2: 8943f080 (pcpu_alloc_mutex){+.+.}, at:
>> pcpu_balance_workfn+0xcc/0x13e0 mm/percpu.c:1845
>>  #3: 89450c78 (vmap_area_lock){+.+.}, at: spin_lock
>> include/linux/spinlock.h:338 [inline]
>>  #3: 89450c78 (vmap_area_lock){+.+.}, at:
>> pcpu_get_vm_areas+0x1449/0x3df0 mm/vmalloc.c:3431
>> Preemption disabled at:
>> [] spin_lock include/linux/spinlock.h:338 [inline]
>> [] pcpu_get_vm_areas+0x1449/0x3df0 mm/vmalloc.c:3431
>> CPU: 1 PID: 44 Comm: kworker/1:1 Not tainted 5.4.0-next-20191129+ #5
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
>> Workqueue: events pcpu_balance_workfn
>> Call Trace:
>>  __dump_stack lib/dump_stack.c:77 [inline]
>>  dump_stack+0x199/0x216 lib/dump_stack.c:118
>>  ___might_sleep.cold.97+0x1f5/0x238 kernel/sched/core.c:6800
>>  __might_sleep+0x95/0x190 kernel/sched/core.c:6753
>>  prepare_alloc_pages mm/page_alloc.c:4681 [inline]
>>  __alloc_pages_nodemask+0x3cd/0x890 mm/page_alloc.c:4730
>>  alloc_pages_current+0x10c/0x210 mm/mempolicy.c:2211
>>  alloc_pages include/linux/gfp.h:532 [inline]
>>  __get_free_pages+0xc/0x40 mm/page_alloc.c:4786
>>  kasan_populate_vmalloc_pte mm/kasan/common.c:762 [inline]
>>  kasan_populate_vmalloc_pte+0x2f/0x1b0 mm/kasan/common.c:753
>>  apply_to_pte_range mm/memory.c:2041 [inline]
>>  apply_to_pmd_range mm/memory.c:2068 [inline]
>>  apply_to_pud_range mm/memory.c:2088 [inline]
>>  apply_to_p4d_range mm/memory.c:2108 [inline]
>>  apply_to_page_range+0x5ca/0xa00 mm/memory.c:2133
>>  kasan_populate_vmalloc+0x69/0xa0 mm/kasan/common.c:791
>>  pcpu_get_vm_areas+0x1596/0x3df0 mm/vmalloc.c:3439
>>  pcpu_create_chunk+0x240/0x7f0 mm/percpu-vm.c:340
>>  pcpu_balance_workfn+0x1033/0x13e0 mm/percpu.c:1934
>>  process_one_work+0x9b5/0x1750 kernel/workqueue.c:2264
>>  worker_thread+0x8b/0xd20 kernel/workqueue.c:2410
>>  kthread+0x365/0x450 kernel/kthread.c:255
>>  ret_from_fork+0x24/0x30 arch/x86/entry/entry_64.S:352
>>
>>
>> Not sure if it's the same or not. Is it addressed by something in flight?

It looks like this one is the same.

There is a patch to fix it:
https://lore.kernel.org/linux-mm/20191120052719.7201-1-...@axtens.net/

Andrew said he had picked it up on the 22nd:
https://marc.info/?l=linux-mm-commits&m=157438241512561&w=2
It's landed in mmots but not mmotm, so hopefully that will happen and
then it will land in -next very soon!

I will look into your other bug report shortly.

Regards,
Daniel

>>
>> My config:
>> https://gist.githubusercontent.com/dvyukov/36c7be311fdec9cd51c649f7c3cb2ddb/raw/39c6f864fdd0ffc53f0822b14c354a73c1695fa1/gistfile1.txt
>
>
> I've tried this fix for pcpu_get_vm_areas:
> https://groups.google.com/d/msg/kasan-dev/t_F2X1MWKwk/h152Z3q2AgAJ
> and it helps. But this will break syzbot on linux-next soon.


Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-11-29 Thread Andrey Ryabinin



On 11/29/19 2:47 PM, Dmitry Vyukov wrote:
> On Fri, Nov 29, 2019 at 12:38 PM Andrey Ryabinin
>  wrote:
>
>
> Not sure if it's the same or not. Is it addressed by something in flight?
>
> My config:
> https://gist.githubusercontent.com/dvyukov/36c7be311fdec9cd51c649f7c3cb2ddb/raw/39c6f864fdd0ffc53f0822b14c354a73c1695fa1/gistfile1.txt


 I've tried this fix for pcpu_get_vm_areas:
 https://groups.google.com/d/msg/kasan-dev/t_F2X1MWKwk/h152Z3q2AgAJ
 and it helps. But this will break syzbot on linux-next soon.
>>>
>>>
>>> Can this be related as well?
>>> Crashes on accesses to shadow on the ion memory...
>>
>> Nope, it's vm_map_ram() not being handled
> 
> 
> Another suspicious one. Related to kasan/vmalloc?

Very likely the same as with ion:

# git grep vm_map_ram|grep xfs
fs/xfs/xfs_buf.c:* vm_map_ram() will allocate auxiliary 
structures (e.g.
fs/xfs/xfs_buf.c:   bp->b_addr = vm_map_ram(bp->b_pages, 
bp->b_page_count,
 
> 
> BUG: unable to handle page fault for address: f52005b8
> #PF: supervisor read access in kernel mode
> #PF: error_code(0x) - not-present page
> PGD 7ffcd067 P4D 7ffcd067 PUD 2cd10067 PMD 66d76067 PTE 0
> Oops:  [#1] PREEMPT SMP KASAN
> CPU: 2 PID: 9211 Comm: syz-executor.2 Not tainted 5.4.0-next-20191129+ #6
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS
> rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014
> RIP: 0010:xfs_sb_read_verify+0xe9/0x540 fs/xfs/libxfs/xfs_sb.c:691
> Code: fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 1e 04 00 00 4d 8b ac 24
> 30 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 <0f> b6
> 04 02 84 c0 74 08 3c 03 0f 8e ad 03 00 00 41 8b 45 00 bf 58
> RSP: 0018:c9000a58f8d0 EFLAGS: 00010a06
> RAX: dc00 RBX: 1920014b1f1d RCX: c9000af42000
> RDX: 192005b8 RSI: 82914404 RDI: 88805cdb1460
> RBP: c9000a58fab0 R08: 8880610cd380 R09: ed1005a87045
> R10: ed1005a87044 R11: 88802d438223 R12: 88805cdb1340
> R13: c9002dc0 R14: c9000a58fa88 R15: 888061b5c000
> FS:  7fb49bda9700() GS:88802d40() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: f52005b8 CR3: 60769006 CR4: 00760ee0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> PKRU: 5554
> Call Trace:
>  xfs_buf_ioend+0x228/0xdc0 fs/xfs/xfs_buf.c:1162
>  __xfs_buf_submit+0x38b/0xe50 fs/xfs/xfs_buf.c:1485
>  xfs_buf_submit fs/xfs/xfs_buf.h:268 [inline]
>  xfs_buf_read_uncached+0x15c/0x560 fs/xfs/xfs_buf.c:897
>  xfs_readsb+0x2d0/0x540 fs/xfs/xfs_mount.c:298
>  xfs_fc_fill_super+0x3e6/0x11f0 fs/xfs/xfs_super.c:1415
>  get_tree_bdev+0x444/0x620 fs/super.c:1340
>  xfs_fc_get_tree+0x1c/0x20 fs/xfs/xfs_super.c:1550
>  vfs_get_tree+0x8e/0x300 fs/super.c:1545
>  do_new_mount fs/namespace.c:2822 [inline]
>  do_mount+0x152d/0x1b50 fs/namespace.c:3142
>  ksys_mount+0x114/0x130 fs/namespace.c:3351
>  __do_sys_mount fs/namespace.c:3365 [inline]
>  __se_sys_mount fs/namespace.c:3362 [inline]
>  __x64_sys_mount+0xbe/0x150 fs/namespace.c:3362
>  do_syscall_64+0xfa/0x780 arch/x86/entry/common.c:294
>  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> RIP: 0033:0x46736a
> Code: 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f
> 84 00 00 00 00 00 0f 1f 44 00 00 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d
> 01 f0 ff ff 73 01 c3 48 c7 c1 bc ff ff ff f7 d8 64 89 01 48
> RSP: 002b:7fb49bda8a78 EFLAGS: 0202 ORIG_RAX: 00a5
> RAX: ffda RBX: 7fb49bda8af0 RCX: 0046736a
> RDX: 7fb49bda8ad0 RSI: 2140 RDI: 7fb49bda8af0
> RBP: 7fb49bda8ad0 R08: 7fb49bda8b30 R09: 7fb49bda8ad0
> R10:  R11: 0202 R12: 7fb49bda8b30
> R13: 004b1c60 R14: 004b006d R15: 7fb49bda96bc
> Modules linked in:
> Dumping ftrace buffer:
>(ftrace buffer empty)
> CR2: f52005b8
> ---[ end trace eddd8949d4c898df ]---
> RIP: 0010:xfs_sb_read_verify+0xe9/0x540 fs/xfs/libxfs/xfs_sb.c:691
> Code: fc ff df 48 c1 ea 03 80 3c 02 00 0f 85 1e 04 00 00 4d 8b ac 24
> 30 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 ea 48 c1 ea 03 <0f> b6
> 04 02 84 c0 74 08 3c 03 0f 8e ad 03 00 00 41 8b 45 00 bf 58
> RSP: 0018:c9000a58f8d0 EFLAGS: 00010a06
> RAX: dc00 RBX: 1920014b1f1d RCX: c9000af42000
> RDX: 192005b8 RSI: 82914404 RDI: 88805cdb1460
> RBP: c9000a58fab0 R08: 8880610cd380 R09: ed1005a87045
> R10: ed1005a87044 R11: 88802d438223 R12: 88805cdb1340
> R13: c9002dc0 R14: c9000a58fa88 R15: 888061b5c000
> FS:  7fb49bda9700() GS:88802d40() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: f52005b8 CR3: 60769006 CR4: 00760ee0
> DR0:  DR1: 

Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-11-29 Thread Andrey Ryabinin



On 11/29/19 2:02 PM, Dmitry Vyukov wrote:
> On Fri, Nov 29, 2019 at 11:58 AM Dmitry Vyukov  wrote:
>>
>> On Fri, Nov 29, 2019 at 11:43 AM Dmitry Vyukov  wrote:
>>>
>>> On Tue, Nov 19, 2019 at 10:54 AM Andrey Ryabinin
>>>  wrote:
 On 11/18/19 6:29 AM, Daniel Axtens wrote:
> Qian Cai  writes:
>
>> On Thu, 2019-10-31 at 20:39 +1100, Daniel Axtens wrote:
>>> /*
>>>  * In this function, newly allocated vm_struct has VM_UNINITIALIZED
>>>  * flag. It means that vm_struct is not fully initialized.
>>> @@ -3377,6 +3411,9 @@ struct vm_struct **pcpu_get_vm_areas(const 
>>> unsigned long *offsets,
>>>
>>> setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
>>>  pcpu_get_vm_areas);
>>> +
>>> +   /* assume success here */
>>> +   kasan_populate_vmalloc(sizes[area], vms[area]);
>>> }
>>> spin_unlock(&vmap_area_lock);
>>
>> Here it is all wrong. GFP_KERNEL with in_atomic().
>
> I think this fix will work, I will do a v12 with it included.

 You can send just the fix. Andrew will fold it into the original patch 
 before sending it to Linus.



> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index a4b950a02d0b..bf030516258c 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3417,11 +3417,14 @@ struct vm_struct **pcpu_get_vm_areas(const 
> unsigned long *offsets,
>
> setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
>  pcpu_get_vm_areas);
> +   }
> +   spin_unlock(&vmap_area_lock);
>
> +   /* populate the shadow space outside of the lock */
> +   for (area = 0; area < nr_vms; area++) {
> /* assume success here */
> kasan_populate_vmalloc(sizes[area], vms[area]);
> }
> -   spin_unlock(&vmap_area_lock);
>
> kfree(vas);
> return vms;
>>>
>>> Hi,
>>>
>>> I am testing this support on next-20191129 and seeing the following 
>>> warnings:
>>>
>>> BUG: sleeping function called from invalid context at mm/page_alloc.c:4681
>>> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 44, name: kworker/1:1
>>> 4 locks held by kworker/1:1/44:
>>>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at:
>>> __write_once_size include/linux/compiler.h:247 [inline]
>>>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at:
>>> arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>>>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at: atomic64_set
>>> include/asm-generic/atomic-instrumented.h:868 [inline]
>>>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at:
>>> atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>>>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at: set_work_data
>>> kernel/workqueue.c:615 [inline]
>>>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at:
>>> set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
>>>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at:
>>> process_one_work+0x88b/0x1750 kernel/workqueue.c:2235
>>>  #1: c92afdf0 (pcpu_balance_work){+.+.}, at:
>>> process_one_work+0x8c0/0x1750 kernel/workqueue.c:2239
>>>  #2: 8943f080 (pcpu_alloc_mutex){+.+.}, at:
>>> pcpu_balance_workfn+0xcc/0x13e0 mm/percpu.c:1845
>>>  #3: 89450c78 (vmap_area_lock){+.+.}, at: spin_lock
>>> include/linux/spinlock.h:338 [inline]
>>>  #3: 89450c78 (vmap_area_lock){+.+.}, at:
>>> pcpu_get_vm_areas+0x1449/0x3df0 mm/vmalloc.c:3431
>>> Preemption disabled at:
>>> [] spin_lock include/linux/spinlock.h:338 [inline]
>>> [] pcpu_get_vm_areas+0x1449/0x3df0 mm/vmalloc.c:3431
>>> CPU: 1 PID: 44 Comm: kworker/1:1 Not tainted 5.4.0-next-20191129+ #5
>>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
>>> Workqueue: events pcpu_balance_workfn
>>> Call Trace:
>>>  __dump_stack lib/dump_stack.c:77 [inline]
>>>  dump_stack+0x199/0x216 lib/dump_stack.c:118
>>>  ___might_sleep.cold.97+0x1f5/0x238 kernel/sched/core.c:6800
>>>  __might_sleep+0x95/0x190 kernel/sched/core.c:6753
>>>  prepare_alloc_pages mm/page_alloc.c:4681 [inline]
>>>  __alloc_pages_nodemask+0x3cd/0x890 mm/page_alloc.c:4730
>>>  alloc_pages_current+0x10c/0x210 mm/mempolicy.c:2211
>>>  alloc_pages include/linux/gfp.h:532 [inline]
>>>  __get_free_pages+0xc/0x40 mm/page_alloc.c:4786
>>>  kasan_populate_vmalloc_pte mm/kasan/common.c:762 [inline]
>>>  kasan_populate_vmalloc_pte+0x2f/0x1b0 mm/kasan/common.c:753
>>>  apply_to_pte_range mm/memory.c:2041 [inline]
>>>  apply_to_pmd_range mm/memory.c:2068 [inline]
>>>  apply_to_pud_range mm/memory.c:2088 [inline]
>>>  apply_to_p4d_range mm/memory.c:2108 [inline]
>>>  apply_to_page_range+0x5ca/0xa00 mm/memory.c:2133
>>>  kasan_populate_vmalloc+0x69/0xa0 mm/kasan/common.c:791
>>>  pcpu_get_vm_areas+0x1596/0x3df0 mm/vmalloc.c:3439
>>

Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-11-29 Thread Dmitry Vyukov
On Fri, Nov 29, 2019 at 11:43 AM Dmitry Vyukov  wrote:
>
> On Tue, Nov 19, 2019 at 10:54 AM Andrey Ryabinin
>  wrote:
> > On 11/18/19 6:29 AM, Daniel Axtens wrote:
> > > Qian Cai  writes:
> > >
> > >> On Thu, 2019-10-31 at 20:39 +1100, Daniel Axtens wrote:
> > >>> /*
> > >>>  * In this function, newly allocated vm_struct has VM_UNINITIALIZED
> > >>>  * flag. It means that vm_struct is not fully initialized.
> > >>> @@ -3377,6 +3411,9 @@ struct vm_struct **pcpu_get_vm_areas(const 
> > >>> unsigned long *offsets,
> > >>>
> > >>> setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
> > >>>  pcpu_get_vm_areas);
> > >>> +
> > >>> +   /* assume success here */
> > >>> +   kasan_populate_vmalloc(sizes[area], vms[area]);
> > >>> }
> > >>> spin_unlock(&vmap_area_lock);
> > >>
> > >> Here it is all wrong. GFP_KERNEL with in_atomic().
> > >
> > > I think this fix will work, I will do a v12 with it included.
> >
> > You can send just the fix. Andrew will fold it into the original patch 
> > before sending it to Linus.
> >
> >
> >
> > > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > > index a4b950a02d0b..bf030516258c 100644
> > > --- a/mm/vmalloc.c
> > > +++ b/mm/vmalloc.c
> > > @@ -3417,11 +3417,14 @@ struct vm_struct **pcpu_get_vm_areas(const 
> > > unsigned long *offsets,
> > >
> > > setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
> > >  pcpu_get_vm_areas);
> > > +   }
> > > +   spin_unlock(&vmap_area_lock);
> > >
> > > +   /* populate the shadow space outside of the lock */
> > > +   for (area = 0; area < nr_vms; area++) {
> > > /* assume success here */
> > > kasan_populate_vmalloc(sizes[area], vms[area]);
> > > }
> > > -   spin_unlock(&vmap_area_lock);
> > >
> > > kfree(vas);
> > > return vms;
>
> Hi,
>
> I am testing this support on next-20191129 and seeing the following warnings:
>
> BUG: sleeping function called from invalid context at mm/page_alloc.c:4681
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 44, name: kworker/1:1
> 4 locks held by kworker/1:1/44:
>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at:
> __write_once_size include/linux/compiler.h:247 [inline]
>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at:
> arch_atomic64_set arch/x86/include/asm/atomic64_64.h:34 [inline]
>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at: atomic64_set
> include/asm-generic/atomic-instrumented.h:868 [inline]
>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at:
> atomic_long_set include/asm-generic/atomic-long.h:40 [inline]
>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at: set_work_data
> kernel/workqueue.c:615 [inline]
>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at:
> set_work_pool_and_clear_pending kernel/workqueue.c:642 [inline]
>  #0: 888067c26d28 ((wq_completion)events){+.+.}, at:
> process_one_work+0x88b/0x1750 kernel/workqueue.c:2235
>  #1: c92afdf0 (pcpu_balance_work){+.+.}, at:
> process_one_work+0x8c0/0x1750 kernel/workqueue.c:2239
>  #2: 8943f080 (pcpu_alloc_mutex){+.+.}, at:
> pcpu_balance_workfn+0xcc/0x13e0 mm/percpu.c:1845
>  #3: 89450c78 (vmap_area_lock){+.+.}, at: spin_lock
> include/linux/spinlock.h:338 [inline]
>  #3: 89450c78 (vmap_area_lock){+.+.}, at:
> pcpu_get_vm_areas+0x1449/0x3df0 mm/vmalloc.c:3431
> Preemption disabled at:
> [] spin_lock include/linux/spinlock.h:338 [inline]
> [] pcpu_get_vm_areas+0x1449/0x3df0 mm/vmalloc.c:3431
> CPU: 1 PID: 44 Comm: kworker/1:1 Not tainted 5.4.0-next-20191129+ #5
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.12.0-1 04/01/2014
> Workqueue: events pcpu_balance_workfn
> Call Trace:
>  __dump_stack lib/dump_stack.c:77 [inline]
>  dump_stack+0x199/0x216 lib/dump_stack.c:118
>  ___might_sleep.cold.97+0x1f5/0x238 kernel/sched/core.c:6800
>  __might_sleep+0x95/0x190 kernel/sched/core.c:6753
>  prepare_alloc_pages mm/page_alloc.c:4681 [inline]
>  __alloc_pages_nodemask+0x3cd/0x890 mm/page_alloc.c:4730
>  alloc_pages_current+0x10c/0x210 mm/mempolicy.c:2211
>  alloc_pages include/linux/gfp.h:532 [inline]
>  __get_free_pages+0xc/0x40 mm/page_alloc.c:4786
>  kasan_populate_vmalloc_pte mm/kasan/common.c:762 [inline]
>  kasan_populate_vmalloc_pte+0x2f/0x1b0 mm/kasan/common.c:753
>  apply_to_pte_range mm/memory.c:2041 [inline]
>  apply_to_pmd_range mm/memory.c:2068 [inline]
>  apply_to_pud_range mm/memory.c:2088 [inline]
>  apply_to_p4d_range mm/memory.c:2108 [inline]
>  apply_to_page_range+0x5ca/0xa00 mm/memory.c:2133
>  kasan_populate_vmalloc+0x69/0xa0 mm/kasan/common.c:791
>  pcpu_get_vm_areas+0x1596/0x3df0 mm/vmalloc.c:3439
>  pcpu_create_chunk+0x240/0x7f0 mm/percpu-vm.c:340
>  pcpu_balance_workfn+0x1033/0x13e0 mm/percpu.c:1934
>  process_one_work+0x9b5/0x1750 kernel/workqueue.c:2264
>  worker_thread+0x8b/0xd20 kernel/workqueue.c:2410
>  kthread+0x365/0x450 

Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-11-19 Thread Andrey Ryabinin



On 11/18/19 6:29 AM, Daniel Axtens wrote:
> Qian Cai  writes:
> 
>> On Thu, 2019-10-31 at 20:39 +1100, Daniel Axtens wrote:
>>> /*
>>>  * In this function, newly allocated vm_struct has VM_UNINITIALIZED
>>>  * flag. It means that vm_struct is not fully initialized.
>>> @@ -3377,6 +3411,9 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned 
>>> long *offsets,
>>>  
>>> setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
>>>  pcpu_get_vm_areas);
>>> +
>>> +   /* assume success here */
>>> +   kasan_populate_vmalloc(sizes[area], vms[area]);
>>> }
>>> spin_unlock(&vmap_area_lock);
>>
>> Here it is all wrong. GFP_KERNEL with in_atomic().
> 
> I think this fix will work, I will do a v12 with it included.
 
You can send just the fix. Andrew will fold it into the original patch before 
sending it to Linus.



> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index a4b950a02d0b..bf030516258c 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -3417,11 +3417,14 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned 
> long *offsets,
>  
> setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
>  pcpu_get_vm_areas);
> +   }
> +   spin_unlock(&vmap_area_lock);
>  
> +   /* populate the shadow space outside of the lock */
> +   for (area = 0; area < nr_vms; area++) {
> /* assume success here */
> kasan_populate_vmalloc(sizes[area], vms[area]);
> }
> -   spin_unlock(&vmap_area_lock);
>  
> kfree(vas);
> return vms;
> 
> 


Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-11-17 Thread Daniel Axtens
Qian Cai  writes:

> On Thu, 2019-10-31 at 20:39 +1100, Daniel Axtens wrote:
>>  /*
>>   * In this function, newly allocated vm_struct has VM_UNINITIALIZED
>>   * flag. It means that vm_struct is not fully initialized.
>> @@ -3377,6 +3411,9 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned 
>> long *offsets,
>>  
>>  setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
>>   pcpu_get_vm_areas);
>> +
>> +/* assume success here */
>> +kasan_populate_vmalloc(sizes[area], vms[area]);
>>  }
>>  spin_unlock(&vmap_area_lock);
>
> Here it is all wrong. GFP_KERNEL with in_atomic().

I think this fix will work, I will do a v12 with it included.

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index a4b950a02d0b..bf030516258c 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -3417,11 +3417,14 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned 
long *offsets,
 
setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
 pcpu_get_vm_areas);
+   }
+   spin_unlock(&vmap_area_lock);
 
+   /* populate the shadow space outside of the lock */
+   for (area = 0; area < nr_vms; area++) {
/* assume success here */
kasan_populate_vmalloc(sizes[area], vms[area]);
}
-   spin_unlock(&vmap_area_lock);
 
kfree(vas);
return vms;




Re: [PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-11-15 Thread Qian Cai
On Thu, 2019-10-31 at 20:39 +1100, Daniel Axtens wrote:
>   /*
>* In this function, newly allocated vm_struct has VM_UNINITIALIZED
>* flag. It means that vm_struct is not fully initialized.
> @@ -3377,6 +3411,9 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned 
> long *offsets,
>  
>   setup_vmalloc_vm_locked(vms[area], vas[area], VM_ALLOC,
>pcpu_get_vm_areas);
> +
> + /* assume success here */
> + kasan_populate_vmalloc(sizes[area], vms[area]);
>   }
>   spin_unlock(&vmap_area_lock);

Here it is all wrong. GFP_KERNEL with in_atomic().

[   32.231000][T1] BUG: sleeping function called from invalid context at
mm/page_alloc.c:4681
[   32.239934][T1] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1,
name: swapper/0
[   32.248896][T1] 2 locks held by swapper/0/1:
[   32.253580][T1]  #0: 880d6160 (pcpu_alloc_mutex){+.+.}, at:
pcpu_alloc+0x707/0xbe0
[   32.262305][T1]  #1: 88105558 (vmap_area_lock){+.+.}, at:
pcpu_get_vm_areas+0xc4f/0x1e60
[   32.271919][T1] CPU: 4 PID: 1 Comm: swapper/0 Tainted:
GW 5.4.0-rc7-next-20191115+ #6
[   32.281555][T1] Hardware name: HPE ProLiant DL385 Gen10/ProLiant DL385
Gen10, BIOS A40 03/09/2018
[   32.281896][T1] Call Trace:
[   32.281896][T1]  dump_stack+0xa0/0xea
[   32.281896][T1]  ___might_sleep.cold.89+0xd2/0x122
[   32.301996][T1]  __might_sleep+0x73/0xe0
[   32.301996][T1]  __alloc_pages_nodemask+0x442/0x720
[   32.311564][T1]  ? __kasan_check_read+0x11/0x20
[   32.311564][T1]  ? __alloc_pages_slowpath+0x1870/0x1870
[   32.321705][T1]  ? mark_held_locks+0x86/0xb0
[   32.321705][T1]  ? _raw_spin_unlock_irqrestore+0x44/0x50
[   32.331563][T1]  alloc_page_interleave+0x18/0x130
[   32.331563][T1]  alloc_pages_current+0xf6/0x110
[   32.341979][T1]  __get_free_pages+0x12/0x60
[   32.341979][T1]  __pte_alloc_kernel+0x1b/0xc0
[   32.351563][T1]  apply_to_page_range+0x5b5/0x690
[   32.351563][T1]  ? memset+0x40/0x40
[   32.361693][T1]  kasan_populate_vmalloc+0x6d/0xa0
[   32.361693][T1]  pcpu_get_vm_areas+0xd49/0x1e60
[   32.371425][T1]  ? vm_map_ram+0x10d0/0x10d0
[   32.371425][T1]  ? pcpu_mem_zalloc+0x65/0x90
[   32.371425][T1]  pcpu_create_chunk+0x152/0x3f0
[   32.371425][T1]  pcpu_alloc+0xa2f/0xbe0
[   32.391423][T1]  ? pcpu_balance_workfn+0xb00/0xb00
[   32.391423][T1]  ? __kasan_kmalloc.constprop.11+0xc1/0xd0
[   32.391423][T1]  ? kasan_kmalloc+0x9/0x10
[   32.391423][T1]  ? kmem_cache_alloc_trace+0x1f8/0x470
[   32.411421][T1]  ? iommu_dma_get_resv_regions+0x10/0x10
[   32.411421][T1]  __alloc_percpu+0x15/0x20
[   32.411421][T1]  init_iova_flush_queue+0x79/0x230
[   32.411421][T1]  iommu_setup_dma_ops+0x87d/0x890
[   32.431420][T1]  ? __kasan_check_write+0x14/0x20
[   32.431420][T1]  ? refcount_sub_and_test_checked+0xba/0x170
[   32.431420][T1]  ? __kasan_check_write+0x14/0x20
[   32.431420][T1]  ? iommu_dma_alloc+0x1e0/0x1e0
[   32.451420][T1]  ? iommu_group_get_for_dev+0x153/0x450
[   32.451420][T1]  ? refcount_dec_and_test_checked+0x11/0x20
[   32.451420][T1]  ? kobject_put+0x36/0x270
[   32.451420][T1]  amd_iommu_add_device+0x560/0x710
[   32.471423][T1]  ? iommu_probe_device+0x150/0x150
[   32.471423][T1]  iommu_probe_device+0x8c/0x150
[   32.471423][T1]  add_iommu_group+0xe/0x20
[   32.471423][T1]  bus_for_each_dev+0xfe/0x160
[   32.491421][T1]  ? subsys_dev_iter_init+0x80/0x80
[   32.491421][T1]  ? blocking_notifier_chain_register+0x4f/0x70
[   32.491421][T1]  bus_set_iommu+0xc6/0x100
[   32.491421][T1]  ? e820__memblock_setup+0x10e/0x10e
[   32.511571][T1]  amd_iommu_init_api+0x25/0x3e
[   32.511571][T1]  state_next+0x214/0x7ea
[   32.511571][T1]  ? check_flags.part.25+0x86/0x220
[   32.511571][T1]  ? early_amd_iommu_init+0x10c0/0x10c0
[   32.531421][T1]  ? e820__memblock_setup+0x10e/0x10e
[   32.531421][T1]  ? rcu_read_lock_sched_held+0xac/0xe0
[   32.531421][T1]  ? e820__memblock_setup+0x10e/0x10e
[   32.551423][T1]  amd_iommu_init+0x25/0x57
[   32.551423][T1]  pci_iommu_init+0x26/0x62
[   32.551423][T1]  do_one_initcall+0xfe/0x4fa
[   32.551423][T1]  ? perf_trace_initcall_level+0x240/0x240
[   32.571420][T1]  ? rcu_read_lock_sched_held+0xac/0xe0
[   32.571420][T1]  ? rcu_read_lock_bh_held+0xc0/0xc0
[   32.571420][T1]  ? __kasan_check_read+0x11/0x20
[   32.571420][T1]  kernel_init_freeable+0x420/0x4e4
[   32.591420][T1]  ? start_kernel+0x6a9/0x6a9
[   32.591420][T1]  ? lockdep_hardirqs_on+0x1b0/0x2a0
[   32.591420][T1]  ? _raw_spin_unlock_irq+0x27/0x40
[   32.591420][T1]  ? rest_init+0x307/0x307
[   32.611557][T1]  kernel_init+0x11/0x139
[   32.611557][T1]  ? rest_init+0x307/0x307
[   32.611557][T1]  ret_from_fork+0x27/0x50


[   32.0546

[PATCH v11 1/4] kasan: support backing vmalloc space with real shadow memory

2019-10-31 Thread Daniel Axtens
Hook into vmalloc and vmap, and dynamically allocate real shadow
memory to back the mappings.

Most mappings in vmalloc space are small, requiring less than a full
page of shadow space. Allocating a full shadow page per mapping would
therefore be wasteful. Furthermore, to ensure that different mappings
use different shadow pages, mappings would have to be aligned to
KASAN_SHADOW_SCALE_SIZE * PAGE_SIZE.

Instead, share backing space across multiple mappings. Allocate a
backing page when a mapping in vmalloc space uses a particular page of
the shadow region. This page can be shared by other vmalloc mappings
later on.
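
As a rough sketch of that allocate-on-demand step (illustrative only; the
callback name matches the one in mm/kasan/common.c seen in the stack traces
elsewhere in this thread, but the details here are not meant to be the exact
patch), the per-PTE hook invoked via apply_to_page_range() looks roughly like:

static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
				      void *unused)
{
	unsigned long page;
	pte_t pte;

	/* Another mapping already backed this shadow page: share it. */
	if (likely(!pte_none(*ptep)))
		return 0;

	page = __get_free_page(GFP_KERNEL);
	if (!page)
		return -ENOMEM;

	/* New shadow starts out poisoned; mappings unpoison their slice. */
	memset((void *)page, KASAN_VMALLOC_INVALID, PAGE_SIZE);
	pte = pfn_pte(PFN_DOWN(__pa(page)), PAGE_KERNEL);

	spin_lock(&init_mm.page_table_lock);
	if (likely(pte_none(*ptep))) {
		set_pte_at(&init_mm, addr, ptep, pte);
		page = 0;
	}
	spin_unlock(&init_mm.page_table_lock);

	/* Lost the race: another CPU already installed a shadow page. */
	if (page)
		free_page(page);
	return 0;
}

The pte_none() re-check under init_mm.page_table_lock is what lets concurrent
mappings share a single backing page for the same piece of shadow.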

We hook in to the vmap infrastructure to lazily clean up unused shadow
memory.

To avoid the difficulties around swapping mappings around, this code
expects that the part of the shadow region that covers the vmalloc
space will not be covered by the early shadow page, but will be left
unmapped. This will require changes in arch-specific code.

This allows KASAN with VMAP_STACK, and may be helpful for architectures
that do not have a separate module space (e.g. powerpc64, which I am
currently working on). It also allows relaxing the module alignment
back to PAGE_SIZE.

Testing with test_vmalloc.sh on an x86 VM with 2 vCPUs shows that:

 - Turning on KASAN, inline instrumentation, without vmalloc, introduces
   a 4.1x-4.2x slowdown in vmalloc operations.

 - Turning this on introduces the following slowdowns over KASAN:
 * ~1.76x slower single-threaded (test_vmalloc.sh performance)
 * ~2.18x slower when both cpus are performing operations
   simultaneously (test_vmalloc.sh sequential_test_order=1)

This is unfortunate, but given that this is a debug-only feature, it is not
the end of the world.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=202009
Acked-by: Vasily Gorbik 
Reviewed-by: Andrey Ryabinin 
Co-developed-by: Mark Rutland 
Signed-off-by: Mark Rutland  [shadow rework]
Signed-off-by: Daniel Axtens 

--

v2: let kasan_unpoison_shadow deal with ranges that do not use a
full shadow byte.

v3: relax module alignment
rename to kasan_populate_vmalloc which is a much better name
deal with concurrency correctly

v4: Mark's rework
Poison pages on vfree
Handle allocation failures

v5: Per Christophe Leroy, split out test and dynamically free pages.

v6: Guard freeing page properly. Drop WARN_ON_ONCE(pte_none(*ptep)),
 on reflection it's unnecessary debugging cruft with too high a
 false positive rate.

v7: tlb flush, thanks Mark.
explain more clearly how freeing works and is concurrency-safe.

v9:  - Pull in Uladzislau Rezki's changes to better line up with the
   design of the new vmalloc implementation. Thanks Vlad.
 - clarify comment explaining smp_wmb() per Mark and Andrey's discussion
 - tighten up the allocation of backing memory so that it only
   happens for vmalloc or module  space allocations. Thanks Andrey
   Ryabinin.
 - A TLB flush in the freeing path, thanks Mark Rutland.

v10: - rebase on next, pulling in Vlad's new work on splitting the
   vmalloc locks. This doesn't require changes in our behaviour
   but does require rechecking and rewording the explanation of why
   our behaviour is safe.
 - after much discussion of barriers, I now document where I think they
   are needed and why. Thanks Mark and Andrey.
 - clean up some TLB flushing. We were doing it twice - once after each
   page and once at the end of the whole process. Only do it at the end
   of the whole depopulate process.
 - checkpatch cleanups

v11: Nit from Andrey, tighten up release to vmalloc/module space, thanks Vlad.
 Add benchmark results.

The full benchmark results are:

Performance

                               No KASAN   KASAN original  x baseline   KASAN vmalloc  x baseline  x KASAN

fix_size_alloc_test             1697913         14229459        8.38        22981983       13.54     1.62
full_fit_alloc_test             1841601         15152633        8.23        17902922        9.72     1.18
long_busy_list_alloc_test      17874082         58856758        3.29       103925371        5.81     1.77
random_size_alloc_test          9356047         29544085        3.16        57871338        6.19     1.96
fix_align_alloc_test            3188968         19821620        6.22        37979436       11.91     1.92
random_size_align_alloc_test    3033507         17584339        5.80        32588942       10.74     1.85
align_shift_alloc_test              325             1154        3.55            7263       22.35     6.29
pcpu_alloc_test                  231952           278181        1.20          318977        1.38     1.15
Total Cycles               235852824254     985040965542        4.18   1733258779416        7.35     1.76

Sequential, 2 cpus

                               No KASAN   KASAN original  x baseline   KASAN vmalloc  x baseline  x KASAN

fix_size_alloc_test250580