Re: [PATCH 3/3] btrfs: fix pinned underflow after transaction aborted
On Wed, Oct 24, 2018 at 08:24:03PM +0800, Lu Fengqi wrote: > When running generic/475, we may get the following warning in the dmesg. > > [ 6902.102154] WARNING: CPU: 3 PID: 18013 at fs/btrfs/extent-tree.c:9776 > btrfs_free_block_groups+0x2af/0x3b0 [btrfs] > [ 6902.104886] Modules linked in: btrfs(O) xor zstd_decompress zstd_compress > xxhash raid6_pq efivarfs xfs nvme nvme_core [last unloaded: btrfs] > [ 6902.109160] CPU: 3 PID: 18013 Comm: umount Tainted: GW O > 4.19.0-rc8+ #8 > [ 6902.110971] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 > 02/06/2015 > [ 6902.112857] RIP: 0010:btrfs_free_block_groups+0x2af/0x3b0 [btrfs] > [ 6902.114377] Code: c6 48 89 04 24 48 8b 83 50 17 00 00 48 39 c6 0f 84 ab 00 > 00 00 4c 8b ab 50 17 00 00 49 83 bd 50 ff ff ff 00 0f 84 b4 00 00 00 <0f> 0b > 31 c9 49 8d b5 f8 fe ff ff 31 d2 48 > 89 df e8 fc 76 ff ff 49 You can remove this > [ 6902.118921] RSP: 0018:c9000459bdb0 EFLAGS: 00010286 > [ 6902.120315] RAX: 880175050bb0 RBX: 8801124a8000 RCX: > 00170007 > [ 6902.121969] RDX: 0002 RSI: 00170007 RDI: > 8125fb74 > [ 6902.123716] RBP: 880175055d10 R08: R09: > > [ 6902.125417] R10: R11: R12: > 880175055d88 > [ 6902.127129] R13: 880175050bb0 R14: R15: > dead0100 > [ 6902.129060] FS: 7f4507223780() GS:88017ba0() > knlGS: > [ 6902.130996] CS: 0010 DS: ES: CR0: 80050033 > [ 6902.132558] CR2: 5623599cac78 CR3: 00014b71 CR4: > 003606e0 > [ 6902.134270] DR0: DR1: DR2: > > [ 6902.135981] DR3: DR6: fffe0ff0 DR7: > 0400 > [ 6902.137836] Call Trace: > [ 6902.138939] close_ctree+0x171/0x330 [btrfs] > [ 6902.140181] ? kthread_stop+0x146/0x1f0 > [ 6902.141277] generic_shutdown_super+0x6c/0x100 > [ 6902.142517] kill_anon_super+0x14/0x30 > [ 6902.143554] btrfs_kill_super+0x13/0x100 [btrfs] > [ 6902.144790] deactivate_locked_super+0x2f/0x70 > [ 6902.146014] cleanup_mnt+0x3b/0x70 > [ 6902.147020] task_work_run+0x9e/0xd0 > [ 6902.148036] do_syscall_64+0x470/0x600 > [ 6902.149142] ? trace_hardirqs_off_thunk+0x1a/0x1c > [ 6902.150375] entry_SYSCALL_64_after_hwframe+0x49/0xbe > [ 6902.151640] RIP: 0033:0x7f45077a6a7b > [ 6902.152782] Code: 23 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa > 31 f6 e9 05 00 00 00 90 0f 1f 40 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d > 01 f0 ff ff 73 01 c3 48 8b 0d b5 23 > 0c 00 f7 d8 64 89 01 48 and this line from the changelog (unless there's a reason to keep them). > [ 6902.157324] RSP: 002b:7ffd589f3e68 EFLAGS: 0246 ORIG_RAX: > 00a6 > [ 6902.159187] RAX: RBX: 55e8eec732b0 RCX: > 7f45077a6a7b > [ 6902.160834] RDX: 0001 RSI: RDI: > 55e8eec73490 > [ 6902.162526] RBP: R08: 55e8eec734b0 R09: > 7ffd589f26c0 > [ 6902.164141] R10: R11: 0246 R12: > 55e8eec73490 > [ 6902.165815] R13: 7f4507ac61a4 R14: R15: > 7ffd589f40d8 > [ 6902.167553] irq event stamp: 0 > [ 6902.168998] hardirqs last enabled at (0): [<>] > (null) > [ 6902.170731] hardirqs last disabled at (0): [] > copy_process.part.55+0x3b0/0x1f00 > [ 6902.172773] softirqs last enabled at (0): [] > copy_process.part.55+0x3b0/0x1f00 > [ 6902.174671] softirqs last disabled at (0): [<>] > (null) > [ 6902.176407] ---[ end trace 463138c2986b275c ]--- > [ 6902.177636] BTRFS info (device dm-3): space_info 4 has 273465344 free, is > not full > [ 6902.179453] BTRFS info (device dm-3): space_info total=276824064, > used=4685824, pinned=18446744073708158976, reserved=0, may_use=0, > readonly=65536 > > ^^^ > > obviously underflow ^ I'll reformat that a bit so the text is actually in the visible range | and we don't need to put signs like - > When transaction_kthread is running cleanup_transaction(), another > fsstress is running btrfs_commit_transaction(). The > btrfs_finish_extent_commit() may get the same range as > btrfs_destroy_pinned_extent() got, which causes the pinned underflow. So this completes what d4b450cd4b33 ("Btrfs: fix race between transaction commit and empty block group removal") fixed in the automatic block group removal 47ab2a6c689913d. I'll add the stable tags too, and queue it for 4.20. Thanks.
[PATCH 3/3] btrfs: fix pinned underflow after transaction aborted
When running generic/475, we may get the following warning in the dmesg. [ 6902.102154] WARNING: CPU: 3 PID: 18013 at fs/btrfs/extent-tree.c:9776 btrfs_free_block_groups+0x2af/0x3b0 [btrfs] [ 6902.104886] Modules linked in: btrfs(O) xor zstd_decompress zstd_compress xxhash raid6_pq efivarfs xfs nvme nvme_core [last unloaded: btrfs] [ 6902.109160] CPU: 3 PID: 18013 Comm: umount Tainted: GW O 4.19.0-rc8+ #8 [ 6902.110971] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 [ 6902.112857] RIP: 0010:btrfs_free_block_groups+0x2af/0x3b0 [btrfs] [ 6902.114377] Code: c6 48 89 04 24 48 8b 83 50 17 00 00 48 39 c6 0f 84 ab 00 00 00 4c 8b ab 50 17 00 00 49 83 bd 50 ff ff ff 00 0f 84 b4 00 00 00 <0f> 0b 31 c9 49 8d b5 f8 fe ff ff 31 d2 48 89 df e8 fc 76 ff ff 49 [ 6902.118921] RSP: 0018:c9000459bdb0 EFLAGS: 00010286 [ 6902.120315] RAX: 880175050bb0 RBX: 8801124a8000 RCX: 00170007 [ 6902.121969] RDX: 0002 RSI: 00170007 RDI: 8125fb74 [ 6902.123716] RBP: 880175055d10 R08: R09: [ 6902.125417] R10: R11: R12: 880175055d88 [ 6902.127129] R13: 880175050bb0 R14: R15: dead0100 [ 6902.129060] FS: 7f4507223780() GS:88017ba0() knlGS: [ 6902.130996] CS: 0010 DS: ES: CR0: 80050033 [ 6902.132558] CR2: 5623599cac78 CR3: 00014b71 CR4: 003606e0 [ 6902.134270] DR0: DR1: DR2: [ 6902.135981] DR3: DR6: fffe0ff0 DR7: 0400 [ 6902.137836] Call Trace: [ 6902.138939] close_ctree+0x171/0x330 [btrfs] [ 6902.140181] ? kthread_stop+0x146/0x1f0 [ 6902.141277] generic_shutdown_super+0x6c/0x100 [ 6902.142517] kill_anon_super+0x14/0x30 [ 6902.143554] btrfs_kill_super+0x13/0x100 [btrfs] [ 6902.144790] deactivate_locked_super+0x2f/0x70 [ 6902.146014] cleanup_mnt+0x3b/0x70 [ 6902.147020] task_work_run+0x9e/0xd0 [ 6902.148036] do_syscall_64+0x470/0x600 [ 6902.149142] ? trace_hardirqs_off_thunk+0x1a/0x1c [ 6902.150375] entry_SYSCALL_64_after_hwframe+0x49/0xbe [ 6902.151640] RIP: 0033:0x7f45077a6a7b [ 6902.152782] Code: 23 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 90 f3 0f 1e fa 31 f6 e9 05 00 00 00 90 0f 1f 40 00 f3 0f 1e fa b8 a6 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d b5 23 0c 00 f7 d8 64 89 01 48 [ 6902.157324] RSP: 002b:7ffd589f3e68 EFLAGS: 0246 ORIG_RAX: 00a6 [ 6902.159187] RAX: RBX: 55e8eec732b0 RCX: 7f45077a6a7b [ 6902.160834] RDX: 0001 RSI: RDI: 55e8eec73490 [ 6902.162526] RBP: R08: 55e8eec734b0 R09: 7ffd589f26c0 [ 6902.164141] R10: R11: 0246 R12: 55e8eec73490 [ 6902.165815] R13: 7f4507ac61a4 R14: R15: 7ffd589f40d8 [ 6902.167553] irq event stamp: 0 [ 6902.168998] hardirqs last enabled at (0): [<>] (null) [ 6902.170731] hardirqs last disabled at (0): [] copy_process.part.55+0x3b0/0x1f00 [ 6902.172773] softirqs last enabled at (0): [] copy_process.part.55+0x3b0/0x1f00 [ 6902.174671] softirqs last disabled at (0): [<>] (null) [ 6902.176407] ---[ end trace 463138c2986b275c ]--- [ 6902.177636] BTRFS info (device dm-3): space_info 4 has 273465344 free, is not full [ 6902.179453] BTRFS info (device dm-3): space_info total=276824064, used=4685824, pinned=18446744073708158976, reserved=0, may_use=0, readonly=65536 ^^^ obviously underflow When transaction_kthread is running cleanup_transaction(), another fsstress is running btrfs_commit_transaction(). The btrfs_finish_extent_commit() may get the same range as btrfs_destroy_pinned_extent() got, which causes the pinned underflow. Signed-off-by: Lu Fengqi --- fs/btrfs/disk-io.c | 12 +++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index b0ab41da91d1..00ee5e37e989 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -4359,13 +4359,23 @@ static int btrfs_destroy_pinned_extent(struct btrfs_fs_info *fs_info, unpin = pinned_extents; again: while (1) { + /* +* The btrfs_finish_extent_commit() may get the same range as +* ours between find_first_extent_bit and clear_extent_dirty. +* Hence, hold the unused_bg_unpin_mutex to avoid double unpin +* the same extent range. +*/ + mutex_lock(_info->unused_bg_unpin_mutex); ret = find_first_extent_bit(unpin, 0, , , EXTENT_DIRTY, NULL); -