On 2018/08/24 16:58, Qu Wenruo wrote:
> 
> 
> On 2018/8/24 下午3:54, Misono Tomohiro wrote:
>> On 2018/08/24 16:20, Qu Wenruo wrote:
>>>
>>>
>>> On 2018/8/24 下午3:14, Misono Tomohiro wrote:
>>>> Hi,
>>>>
>>>> On 2018/08/21 14:40, Qu Wenruo wrote:
>>>>> Commit c6887cd11149 ("Btrfs: don't do nocow check unless we have to")
>>>>> makes nocow check less frequent to improve performance.
>>>>>
>>>>> However for quota enabled case, such optimization could lead to extra
>>>>> unnecessary data reservation, which results failure for test case like
>>>>> btrfs/153 in fstests.
>>>>>
>>>>> Fix it by reverting to old behavior for quota enabled case.
>>>>>
>>>>> Fixes: c6887cd11149 ("Btrfs: don't do nocow check unless we have to")
>>>>> Signed-off-by: Qu Wenruo <w...@suse.com>
>>>>> ---
>>>>> changelog
>>>>> v2:
>>>>>   Fix regression for quota+cow case. (Previously it will skip data
>>>>>   reservation if quota is enabled, causing regression for limit case.
>>>>>   Pointed out by Misono)
>>>>> ---
>>>>>  fs/btrfs/file.c | 18 +++++++++++++++++-
>>>>>  1 file changed, 17 insertions(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
>>>>> index 2be00e873e92..183e5fb96f42 100644
>>>>> --- a/fs/btrfs/file.c
>>>>> +++ b/fs/btrfs/file.c
>>>>> @@ -1584,6 +1584,7 @@ static noinline ssize_t btrfs_buffered_write(struct 
>>>>> kiocb *iocb,
>>>>>   int ret = 0;
>>>>>   bool only_release_metadata = false;
>>>>>   bool force_page_uptodate = false;
>>>>> + bool quota_enabled = test_bit(BTRFS_FS_QUOTA_ENABLED, &fs_info->flags);
>>>>>  
>>>>>   nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
>>>>>                   PAGE_SIZE / (sizeof(struct page *)));
>>>>> @@ -1624,13 +1625,28 @@ static noinline ssize_t 
>>>>> btrfs_buffered_write(struct kiocb *iocb,
>>>>>                           fs_info->sectorsize);
>>>>>  
>>>>>           extent_changeset_release(data_reserved);
>>>>> +
>>>>> +         /*
>>>>> +          * If we have quota enabled, we must do the heavy lift nocow
>>>>> +          * check here to avoid reserving data space, or we can hit
>>>>> +          * limitation for NOCOW files.
>>>>> +          */
>>>>> +         if (quota_enabled) {
>>>>> +                 if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
>>>>> +                                               BTRFS_INODE_PREALLOC)) &&
>>>>> +                     check_can_nocow(BTRFS_I(inode), pos,
>>>>> +                                     &write_bytes) > 0)
>>>>> +                         goto reserve_meta_only;
>>>>> +         }
>>>>>           ret = btrfs_check_data_free_space(inode, &data_reserved, pos,
>>>>>                                             write_bytes);
>>>>>           if (ret < 0) {
>>>>>                   if ((BTRFS_I(inode)->flags & (BTRFS_INODE_NODATACOW |
>>>>>                                                 BTRFS_INODE_PREALLOC)) &&
>>>>>                       check_can_nocow(BTRFS_I(inode), pos,
>>>>> -                                 &write_bytes) > 0) {
>>>>> +                                 &write_bytes) > 0 &&
>>>>
>>>>> +                     !quota_enabled) {
>>>>
>>>> (When we check this condition, quota_enabled must be false or otherwise
>>>> we have already goto reserve_meta_only. So it seems redundant.)
>>>
>>> It's possible that we have quota enabled, and then
>>> btrfs_check_data_free_space() failed with -EDQUOT.
>>>
>>> In that case, we need above !quota_enabled check to avoid unnecessary
>>> check and just go error branch.
>>
>> So should quota_enabled be checked before check_can_nocow()?
> 
> Oh, yes, it should be put before nocow check.
> 
>>
>>>
>>>>
>>>>> +reserve_meta_only:
>>>>>                           /*
>>>>>                            * For nodata cow case, no need to reserve
>>>>>                            * data space.
>>>>>
>>>>
>>>> I applied this patch on today's misc-next and it seems mostly ok, but
>>>> btrfs/022 sometimes gives following warning:
>>>
>>> This looks like related to the regression caused by commit
>>> c4c129db5da8f070147f175 ("btrfs: drop unused
>>> parameter qgroup_reserved").
>>>
>>> Would you please try reverting that patch?
>>
>> I think above commit is fixed by commit eb27db470 ("btrfs: fix
>> qgroup_free wrong num_bytes in btrfs_subvolume_reserve_metadata") which
>> is already in misc-next too.
>>
>> I reverted above two patch (and one more related patch 6b0cb14901
>> ("btrfs: drop useless member qgroup_reserved of btrfs_pending_snapshot")),
>> but get the same result.
> 
> So something really goes wrong.
> I assume it's some error handler which double freed, but a quick glance
> nor my initial test run doesn't show something obvious.
> 
> BTW, what's the possibility of such problem in your test environment?

It's like one in several times.
It may depend on hardware performance? (the machine is not so fast),

I also noticed following warning happens too (not always):

[84089.286669] WARNING: CPU: 4 PID: 19255 at fs/btrfs/extent-tree.c:4277 
btrfs_free_reserved_data_space_noquota+0xd2/0xf0 [btrfs]
[84089.286670] Modules linked in: btrfs(O) xor zstd_decompress zstd_compress 
xxhash raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison loop dm_flakey 
xt_CHECKSUM ipt_MASQUERADE tun bridge stp llc xt_conntrack ip_set nfnetlink 
iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack 
iptable_mangle iptable_raw iptable_security sunrpc intel_powerclamp kvm_intel 
kvm gpio_ich iTCO_wdt ipmi_ssif iTCO_vendor_support ipmi_si st irqbypass 
ipmi_devintf crct10dif_pclmul crc32_pclmul ipmi_msghandler ghash_clmulni_intel 
pcspkr acpi_power_meter i2c_i801 pcc_cpufreq i7core_edac lpc_ich acpi_cpufreq 
xfs libcrc32c mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt 
fb_sys_fops ttm drm igb sr_mod hwmon uas ptp crc32c_intel cdrom usb_storage 
pps_core ata_generic megaraid_sas dca pata_acpi
[84089.286708]  i2c_algo_bit ipv6 [last unloaded: xor]
[84089.286712] CPU: 4 PID: 19255 Comm: kworker/u25:1 Tainted: G        W IO     
 4.18.0-rc8+ #98
[84089.286713] Hardware name: FUJITSU-SV                       PRIMERGY RX300 
S6             /D2619, BIOS 6.00 Rev. 1.09.2619.N1           12/13/2010
[84089.286717] Workqueue: writeback wb_workfn (flush-btrfs-474)
[84089.286731] RIP: 0010:btrfs_free_reserved_data_space_noquota+0xd2/0xf0 
[btrfs]
[84089.286731] Code: 49 89 d8 4c 89 f1 4c 89 fa 4c 89 e6 e8 27 51 d1 d2 48 8b 
45 00 48 85 c0 75 db 41 c6 45 00 00 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <0f> 0b 49 
c7 45 28 00 00 00 00 e9 79 ff ff ff 0f 1f 44 00 00 66 2e
[84089.286752] RSP: 0018:ffff8ea1043176e0 EFLAGS: 00010287
[84089.286754] RAX: 0000000000004000 RBX: 0000000000005000 RCX: ffff8c1151e8cae8
[84089.286755] RDX: 0000000000000001 RSI: 00000000000c3000 RDI: ffff8c106602fa00
[84089.286755] RBP: ffffffffffffb000 R08: 0000000000000000 R09: 0000000000000143
[84089.286756] R10: ffff8c11dd95d000 R11: 00000000ffffff01 R12: ffff8c1151e80000
[84089.286757] R13: ffff8c106602fa00 R14: ffff8ea1043177e4 R15: ffff8c1151e80000
[84089.286758] FS:  0000000000000000(0000) GS:ffff8c1137d00000(0000) 
knlGS:0000000000000000
[84089.286759] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[84089.286760] CR2: 00007fcbb3a26000 CR3: 000000019400a002 CR4: 00000000000206e0
[84089.286761] Call Trace:
[84089.286779]  btrfs_clear_bit_hook+0x1a2/0x3f0 [btrfs]
[84089.286796]  clear_state_bit+0x51/0x1b0 [btrfs]
[84089.286812]  __clear_extent_bit+0x364/0x3c0 [btrfs]
[84089.286841]  extent_clear_unlock_delalloc+0x43/0x70 [btrfs]
[84089.286868]  run_delalloc_nocow+0x930/0xa60 [btrfs]
[84089.286897]  run_delalloc_range+0x19c/0x370 [btrfs]
[84089.286926]  writepage_delalloc+0xcf/0x180 [btrfs]
[84089.286955]  __extent_writepage+0x17a/0x310 [btrfs]
[84089.286984]  extent_write_cache_pages+0x131/0x390 [btrfs]
[84089.287012]  ? btrfs_i_callback+0x20/0x20 [btrfs]
[84089.287040]  extent_writepages+0x50/0x80 [btrfs]
[84089.287045]  do_writepages+0x4b/0xe0
[84089.287073]  ? btrfs_get_token_32+0x6b/0x130 [btrfs]
[84089.287077]  ? __writeback_single_inode+0x3d/0x320
[84089.287079]  __writeback_single_inode+0x3d/0x320
[84089.287082]  writeback_sb_inodes+0x19f/0x460
[84089.287086]  wb_writeback+0x107/0x300
[84089.287090]  ? wb_workfn+0xdf/0x410
[84089.287093]  ? current_is_workqueue_rescuer+0x27/0x40
[84089.287095]  wb_workfn+0xdf/0x410
[84089.287099]  ? put_prev_entity+0x20/0x100
[84089.287103]  process_one_work+0x18f/0x370
[84089.287106]  worker_thread+0x30/0x380
[84089.287109]  ? process_one_work+0x370/0x370
[84089.287112]  kthread+0x113/0x130
[84089.287115]  ? kthread_create_worker_on_cpu+0x70/0x70
[84089.287119]  ret_from_fork+0x35/0x40
[84089.287122] ---[ end trace b5975ec96174c615 ]---


> 
> Thanks,
> Qu
> 
>>
>> Thanks,
>> Misono
>>
>>>
>>> Thanks,
>>> Qu
>>>
>>>
>>>>
>>>> [80244.152130] WARNING: CPU: 5 PID: 14575 at fs/btrfs/extent-tree.c:9742 
>>>> btrfs_free_block_groups+0x2d7/0x440 [btrfs]
>>>> [80244.152132] Modules linked in: btrfs(O) xor zstd_decompress 
>>>> zstd_compress xxhash raid6_pq xt_CHECKSUM ipt_MASQUERADE tun bridge stp 
>>>> llc xt_conntrack ip_set nfnetlink iptable_nat nf_conntrack_ipv4 
>>>> nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw 
>>>> iptable_security sunrpc intel_powerclamp kvm_intel kvm gpio_ich iTCO_wdt 
>>>> ipmi_ssif iTCO_vendor_support ipmi_si st irqbypass ipmi_devintf 
>>>> crct10dif_pclmul crc32_pclmul ipmi_msghandler ghash_clmulni_intel pcspkr 
>>>> acpi_power_meter i2c_i801 pcc_cpufreq i7core_edac lpc_ich acpi_cpufreq xfs 
>>>> libcrc32c mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt 
>>>> fb_sys_fops ttm drm igb sr_mod hwmon uas ptp crc32c_intel cdrom 
>>>> usb_storage pps_core ata_generic megaraid_sas dca pata_acpi i2c_algo_bit 
>>>> ipv6 [last unloaded: xor]
>>>> [80244.152185] CPU: 5 PID: 14575 Comm: umount Tainted: G        W IO      
>>>> 4.18.0-rc8+ #98
>>>> [80244.152187] Hardware name: FUJITSU-SV                       PRIMERGY 
>>>> RX300 S6             /D2619, BIOS 6.00 Rev. 1.09.2619.N1           
>>>> 12/13/2010
>>>> [80244.152205] RIP: 0010:btrfs_free_block_groups+0x2d7/0x440 [btrfs]
>>>> [80244.152206] Code: 85 20 cb 00 00 48 39 c6 0f 84 b9 00 00 00 49 bf 00 01 
>>>> 00 00 00 00 ad de 48 8b 9d 20 cb 00 00 48 83 7b a0 00 0f 84 0d 01 00 00 
>>>> <0f> 0b 48 8d 73 88 31 c9 31 d2 48 89 ef e8 27 7a ff ff 48 89 df e8
>>>> [80244.152235] RSP: 0018:ffff8ea10393fdb0 EFLAGS: 00010286
>>>> [80244.152237] RAX: ffff8c1025819e78 RBX: ffff8c1025819e78 RCX: 
>>>> 0000000000000000
>>>> [80244.152238] RDX: 0000000000000001 RSI: ffff8c115329cb20 RDI: 
>>>> ffff8c1025818e00
>>>> [80244.152239] RBP: ffff8c1153290000 R08: 0000000000000000 R09: 
>>>> 0000000000000000
>>>> [80244.152240] R10: ffff8c1025818e98 R11: 0000000000000002 R12: 
>>>> ffff8c11532900a0
>>>> [80244.152241] R13: 0000000000000000 R14: dead000000000200 R15: 
>>>> dead000000000100
>>>> [80244.152243] FS:  00007ff0cc3d8fc0(0000) GS:ffff8c1137d40000(0000) 
>>>> knlGS:0000000000000000
>>>> [80244.152244] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [80244.152245] CR2: 00007ffe75e53c88 CR3: 0000000235d16001 CR4: 
>>>> 00000000000206e0
>>>> [80244.152246] Call Trace:
>>>> [80244.152270]  close_ctree+0x146/0x320 [btrfs]
>>>> [80244.152276]  ? kthread_stop+0x42/0xf0
>>>> [80244.152280]  generic_shutdown_super+0x6c/0x110
>>>> [80244.152283]  kill_anon_super+0xe/0x20
>>>> [80244.152298]  btrfs_kill_super+0x13/0x100 [btrfs]
>>>> [80244.152301]  deactivate_locked_super+0x3f/0x70
>>>> [80244.152303]  cleanup_mnt+0x3b/0x70
>>>> [80244.152305]  task_work_run+0x84/0xa0
>>>> [80244.152308]  do_syscall_64+0x143/0x4cd
>>>> [80244.152311]  ? do_page_fault+0x31/0x130
>>>> [80244.152314]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
>>>> [80244.152316] RIP: 0033:0x7ff0cb43c1a7
>>>> [80244.152317] Code: ad 2b 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 
>>>> 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 
>>>> <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d c9 ac 2b 00 f7 d8 64 89 01 48
>>>> [80244.152346] RSP: 002b:00007ffe75e554b8 EFLAGS: 00000246 ORIG_RAX: 
>>>> 00000000000000a6
>>>> [80244.152348] RAX: 0000000000000000 RBX: 0000563ccced82d0 RCX: 
>>>> 00007ff0cb43c1a7
>>>> [80244.152349] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 
>>>> 0000563ccced84b0
>>>> [80244.152350] RBP: 0000563ccced84b0 R08: 0000000000000005 R09: 
>>>> 0000563ccced84d0
>>>> [80244.152351] R10: 00007ff0cb4ba320 R11: 0000000000000246 R12: 
>>>> 00007ff0cc1d0184
>>>> [80244.152352] R13: 0000000000000000 R14: 0000000000000000 R15: 
>>>> 0000000000000000
>>>> [80244.152354] ---[ end trace b5975ec96174c60c ]---
>>>> [80244.152358] BTRFS info (device sdh2): space_info 1 has 1022132224 free, 
>>>> is not full
>>>> [80244.152360] BTRFS info (device sdh2): space_info total=1082130432, 
>>>> used=60039168, pinned=0, reserved=0, may_use=18446744073709510656, 
>>>> readonly=0
>>>>
>>>> Obviously, may_use value is underflowed.
>>>>
>>>> Thanks,
>>>> Misono
>>>>
>>>>
>>>
>>
> 

Reply via email to