On 3.02.21 г. 12:51 ч., Filipe Manana wrote:
> On Tue, Feb 2, 2021 at 2:15 PM Wang Yugui <wangyu...@e16-tech.com> wrote:
>>
>> Hi, Filipe Manana
>>
>> There are some dbench(sync mode) result on the same hardware,
>> but with different linux kernel
>>
>> 4.14.200
>> Operation Count AvgLat MaxLat
>> ----------------------------------------
>> WriteX 225281 5.163 82.143
>> Flush 32161 2.250 62.669
>> Throughput 236.719 MB/sec (sync open) 32 clients 32 procs
>> max_latency=82.149 ms
>>
>> 4.19.21
>> Operation Count AvgLat MaxLat
>> ----------------------------------------
>> WriteX 118842 10.946 116.345
>> Flush 16506 0.115 44.575
>> Throughput 125.973 MB/sec (sync open) 32 clients 32 procs
>> max_latency=116.390 ms
>>
>> 4.19.150
>> Operation Count AvgLat MaxLat
>> ----------------------------------------
>> WriteX 144509 9.151 117.353
>> Flush 20563 0.128 52.014
>> Throughput 153.707 MB/sec (sync open) 32 clients 32 procs
>> max_latency=117.379 ms
>>
>> 5.4.91
>> Operation Count AvgLat MaxLat
>> ----------------------------------------
>> WriteX 367033 4.377 1908.724
>> Flush 52037 0.159 39.871
>> Throughput 384.554 MB/sec (sync open) 32 clients 32 procs
>> max_latency=1908.968 ms
>
> Ok, it seems somewhere between 4.19 and 5.4, something made the
> latency much worse for you at least.
>
> Is it only when using sync open (O_SYNC, dbench's -s flag), what about
> when not using it?
>
> I'll have to look at it, but it will likely take some time.
This looks like the perf regression I observed starting with kernel 5.0.
Essentially, preemptive flushing of metadata had been broken for quite
some time, but kernel 5.0 removed a btrfs_end_transaction call from
should_end_transaction, which unmasked the issue.
In particular, this should have been fixed by the following commit in
misc-next:
https://github.com/kdave/btrfs-devel/commit/28d7e221e4323a5b98e5d248eb5603ff5206a188
which is part of a larger series of patches. So Wang, in order to test
this hypothesis, can you re-run those tests with the latest misc-next
branch?
<snip>