On Sat, Apr 14, 2018 at 07:12:13AM -0700, Matthew Wilcox wrote:
> From: Matthew Wilcox <[email protected]>
>
> This conversion keeps the radix tree and XArray data structures in sync
> at all times. That allows us to convert the page cache one function at
> a time and should allow for easier bisection. Other than renaming some
> elements of the structures, the data structures are fundamentally
> unchanged; a radix tree walk and an XArray walk will touch the same
> number of cachelines. I have changes planned to the XArray data
> structure, but those will happen in future patches.
I've hit a few failures when throwing this into my test setup. The first two
seem like similar bugs hit in two different ways, the third seems unique.
I've verified that these don't seem to happen with the next-20180413 baseline.
1) Just run some parted commands in a loop:
# while true; do
> parted -s /dev/pmem0 mktable msdos
> parted -s -a optimal /dev/pmem0 mkpart Primary 2MiB 12GiB
> parted -s -a optimal /dev/pmem0 mkpart Primary 12GiB 16382MiB
> done
Within a few seconds I hit:
page:ffffea0004293040 count:0 mapcount:0 mapping:0000000000000000 index:0x0
flags: 0x2fffc000000001(locked)
raw: 002fffc000000001 0000000000000000 0000000000000000 00000000ffffffff
raw: dead000000000100 dead000000000200 0000000000000000 0000000000000000
page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) <= 0)
------------[ cut here ]------------
kernel BUG at ./include/linux/mm.h:853!
invalid opcode: 0000 [#1] PREEMPT SMP PTI
Modules linked in: dax_pmem device_dax nd_pmem nd_btt nfit libnvdimm
CPU: 10 PID: 1539 Comm: systemd-udevd Not tainted
4.16.0-next-20180413-00063-gbbcfa4572878 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27
04/01/2014
RIP: 0010:__add_to_page_cache_locked+0x34b/0x400
RSP: 0018:ffffc90003427a58 EFLAGS: 00010246
RAX: 000000000000003e RBX: ffffea0004293040 RCX: 0000000000000001
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000ffffffff
RBP: ffffc90003427ac8 R08: 0000001ff3fd4371 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88010cd82210
R13: 0000000000000000 R14: 0000000000000000 R15: ffffc90003427ad8
FS: 00007ff6e400c1c0(0000) GS:ffff880115800000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 000055af28d0e390 CR3: 000000010a64e000 CR4: 00000000000006e0
Call Trace:
? memcg_drain_all_list_lrus+0x260/0x260
add_to_page_cache_lru+0x4f/0xe0
mpage_readpages+0xde/0x1d0
? check_disk_change+0x70/0x70
? xa_load+0xbe/0x150
blkdev_readpages+0x1d/0x20
__do_page_cache_readahead+0x203/0x2f0
force_page_cache_readahead+0xb8/0x110
? force_page_cache_readahead+0xb8/0x110
page_cache_sync_readahead+0x43/0x50
generic_file_read_iter+0x842/0xb70
blkdev_read_iter+0x35/0x40
__vfs_read+0xfe/0x170
vfs_read+0xa3/0x150
ksys_read+0x58/0xc0
__x64_sys_read+0x1a/0x20
do_syscall_64+0x65/0x220
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7ff6e3bf2d31
RSP: 002b:00007ffc54c38a78 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
RAX: ffffffffffffffda RBX: 000055af28cdb7c0 RCX: 00007ff6e3bf2d31
RDX: 0000000000040000 RSI: 000055af28de66d8 RDI: 000000000000000f
RBP: 0000000000000000 R08: 000055af28de66b0 R09: 00007ff6e3bdd090
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000040000
R13: 000055af28de66b0 R14: 000055af28cdb810 R15: 000055af28de66c8
Code: 88 fb 55 82 48 89 df e8 64 42 04 00 0f 0b 48 c7 c6 a8 fc 55 82 48 89 df
e8 53 42 04 00 0f 0b 48 c7 c6 18 b5 52 82 e8 45 42 04 00 <0f> 0b 48 c1 f8 02 85
c0 0f 84 59 fe ff ff 45 85 ed 48 c7 43 08
RIP: __add_to_page_cache_locked+0x34b/0x400 RSP: ffffc90003427a58
---[ end trace 0a2ff3a36c6cabde ]---
2) xfs + DAX + generic/095
A spew of this new message:
Page cache invalidation failure on direct I/O. Possible data corruption due to
collision with buffered I/O!
Then a bug similar to the one hit with parted:
BUG: Bad page state in process fio pfn:11e38c
page:ffffea000478e300 count:0 mapcount:0 mapping:0000000000000000 index:0x60
page:ffffea000478e300 count:0 mapcount:0 mapping:0000000000000000 index:0x60
flags: 0x3fffc000010068(uptodate|lru|active|mappedtodisk)
raw: 003fffc000010068 0000000000000000 0000000000000060 00000000ffffffff
raw: ffffea0004ed6220 ffff88011a04d830 0000000000000000 0000000000000000
page dumped because: VM_BUG_ON_PAGE(page_ref_count(page) == 0)
------------[ cut here ]------------
kernel BUG at ./include/linux/mm.h:492!
invalid opcode: 0000 [#1] PREEMPT SMP PTI
Modules linked in: nd_pmem nd_btt dax_pmem device_dax nfit libnvdimm
CPU: 5 PID: 599 Comm: systemd-journal Tainted: G W
4.16.0-next-20180413-00063-gbbcfa4572878 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27
04/01/2014
RIP: 0010:release_pages+0x30e/0x3f0
RSP: 0018:ffffc90001223a68 EFLAGS: 00010046
RAX: 000000000000003e RBX: ffffea000478e300 RCX: 0000000000000003
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00000000ffffffff
RBP: ffffc90001223ae8 R08: 0000000000000000 R09: 0000000000000001
R10: ffffffffffffffff R11: 0000000000000000 R12: ffff88013ffe6000
R13: 0000000000000001 R14: ffff880114fdf058 R15: ffffffff82a1a238
FS: 00007f2765945940(0000) GS:ffff880114e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f2760933010 CR3: 000000010d1b6000 CR4: 00000000000006e0
Call Trace:
pagevec_lru_move_fn+0xc5/0xe0
? get_kernel_page+0x60/0x60
lru_add_drain_cpu+0x100/0x130
lru_add_drain+0x1f/0x40
__pagevec_release+0x18/0x30
write_cache_pages+0x442/0x600
? xfs_vm_readpage+0x150/0x150
? lock_acquire+0xa3/0x210
? xfs_vm_writepages+0x48/0xa0
xfs_vm_writepages+0x6b/0xa0
do_writepages+0x4b/0xf0
__filemap_fdatawrite_range+0xc1/0x100
? __filemap_fdatawrite_range+0xc1/0x100
file_write_and_wait_range+0x5a/0xb0
xfs_file_fsync+0x7c/0x310
vfs_fsync_range+0x48/0x80
do_fsync+0x3d/0x70
__x64_sys_fsync+0x14/0x20
do_syscall_64+0x65/0x220
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x7f276552e4cc
RSP: 002b:00007ffd018b1e20 EFLAGS: 00000293 ORIG_RAX: 000000000000004a
RAX: ffffffffffffffda RBX: 00005576c169b030 RCX: 00007f276552e4cc
RDX: 0000000000000000 RSI: 00005576c169b220 RDI: 0000000000000017
RBP: 0000000000000001 R08: 00000000000000ca R09: 00007f275f9329d0
R10: 0000000000000000 R11: 0000000000000293 R12: 00007ffd018b1f80
R13: 00007ffd018b1f78 R14: 00005576c169b030 R15: 00007ffd018b20e8
Code: 1e 45 31 e4 eb cb 48 8b 75 a8 49 8d bc 24 80 66 01 00 45 31 e4 e8 43 37
a3 00 eb b5 48 c7 c6 50 aa 52 82 48 89 df e8 a2 b8 02 00 <0f> 0b 4d 85 e4 74 11
48 8b 75 a8 49 8d bc 24 80 66 01 00 e8 1a
RIP: release_pages+0x30e/0x3f0 RSP: ffffc90001223a68
---[ end trace e0459ef6b39fc1a5 ]---
3) xfs + generic/270 without DAX
This seems to hit a deadlock:
generic/270 run fstests generic/270 at 2018-04-16 09:23:39
XFS (pmem0p2): Mounting V5 Filesystem
XFS (pmem0p2): Ending clean mount
XFS (pmem0p2): Quotacheck needed: Please wait.
XFS (pmem0p2): Quotacheck: Done.
INFO: rcu_preempt detected stalls on CPUs/tasks:
Tasks blocked on level-0 rcu_node (CPUs 0-11): P15438
(detected by 9, t=65002 jiffies, g=315489, c=315488, q=363)
270 R running task 0 15438 1007 0x00000000
Call Trace:
__schedule+0x2b4/0xab0
? trace_hardirqs_on_thunk+0x1a/0x1c
trace_hardirqs_on_thunk+0x1a/0x1c
? retint_kernel+0x2d/0x2d
? xas_descend+0x3a/0x160
? xas_load+0x53/0x100
? xas_find+0x19e/0x230
? find_get_entries+0x1cc/0x2d0
? pagevec_lookup_entries+0x1e/0x30
? invalidate_mapping_pages+0x8d/0x360
? trace_hardirqs_on_thunk+0x1a/0x1c
? lock_acquire+0xa3/0x210
? drop_pagecache_sb+0xa3/0xf0
? drop_pagecache_sb+0x6e/0xf0
? do_coredump+0x1030/0x1030
? iterate_supers+0x96/0x100
? drop_caches_sysctl_handler+0x6e/0xb0
? proc_sys_call_handler+0x101/0x120
? proc_sys_write+0x14/0x20
? __vfs_write+0x3a/0x180
? rcu_sync_lockdep_assert+0x12/0x60
? __sb_start_write+0x184/0x200
? vfs_write+0xc6/0x1c0
? ksys_write+0x58/0xc0
? __x64_sys_write+0x1a/0x20
? do_syscall_64+0x65/0x220
? entry_SYSCALL_64_after_hwframe+0x49/0xbe
270 R running task 0 15438 1007 0x00000000
Call Trace:
__schedule+0x2b4/0xab0
? trace_hardirqs_on_thunk+0x1a/0x1c
? trace_hardirqs_on_thunk+0x1a/0x1c
? trace_hardirqs_on_thunk+0x1a/0x1c
? retint_kernel+0x2d/0x2d
? xas_descend+0x35/0x160
? xas_descend+0x3a/0x160
? xas_load+0x53/0x100
? xas_find+0x19e/0x230
? find_get_entries+0x1cc/0x2d0
? pagevec_lookup_entries+0x1e/0x30
? invalidate_mapping_pages+0x8d/0x360
? trace_hardirqs_on_thunk+0x1a/0x1c
? lock_acquire+0xa3/0x210
? drop_pagecache_sb+0xa3/0xf0
? drop_pagecache_sb+0x6e/0xf0
? do_coredump+0x1030/0x1030
? iterate_supers+0x96/0x100
? drop_caches_sysctl_handler+0x6e/0xb0
? proc_sys_call_handler+0x101/0x120
? proc_sys_write+0x14/0x20
? __vfs_write+0x3a/0x180
? rcu_sync_lockdep_assert+0x12/0x60
? __sb_start_write+0x184/0x200
? vfs_write+0xc6/0x1c0
? ksys_write+0x58/0xc0
? __x64_sys_write+0x1a/0x20
? do_syscall_64+0x65/0x220
? entry_SYSCALL_64_after_hwframe+0x49/0xbe
# echo w > /proc/sysrq-trigger # shows the following blocked task
sysrq: SysRq : Show Blocked State
task PC stack pid father
kworker/9:3 D 0 15778 2 0x80000000
Workqueue: events key_garbage_collector
Call Trace:
__schedule+0x2ac/0xab0
? wait_for_completion+0x109/0x1a0
schedule+0x36/0x90
schedule_timeout+0x251/0x5c0
? lock_acquire+0xa3/0x210
? wait_for_completion+0x47/0x1a0
? wait_for_completion+0x109/0x1a0
wait_for_completion+0x131/0x1a0
? wake_up_q+0x80/0x80
__wait_rcu_gp+0x144/0x180
synchronize_rcu.part.59+0x41/0x60
? kfree_call_rcu+0x30/0x30
? __bpf_trace_rcu_utilization+0x10/0x10
synchronize_rcu+0x2c/0xa0
key_garbage_collector+0x16a/0x410
process_one_work+0x217/0x670
worker_thread+0x3d/0x3b0
kthread+0x12f/0x150
? process_one_work+0x670/0x670
? kthread_create_worker_on_cpu+0x70/0x70
ret_from_fork+0x3a/0x50
------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot
_______________________________________________
Linux-f2fs-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel