On 2020-12-23 20:41, Josef Bacik wrote:
> Could you give this a try? I'm not able to reproduce the problem, but I'm
> testing inside of a VM.

Since I wanted to rule out NVMe/block layer/scheduler etc., I tried and could reproduce it immediately, see below. Didn't notice it earlier since most of btrfs is read-mostly.. :(

> I'm in the middle of Christmas stuff, but I'll get ahold of a giant
> machine at work tomorrow and see if I can reproduce there. Meanwhile can
> you give this a shot? I have a sneaking suspicion why it happens on your
> baremetal and not in VMs, and this will be a partial enough of a revert of
> the patch you bisected to validate what I'm thinking. Thanks,

The patch doesn't apply to 5.10.x since btrfs_start_delalloc_roots() does not have the trailing true/false argument yet. I removed it, which seemed to have worked. :}

Results using -dsingle/-msingle/space tree, all on tmpfs:

Unpatched:

kernel tree, ~1.1G:

    $ time (cp -a /tmp/linux-5.10.3 /tmp/butter && sync -f /tmp/butter)
    ( cp -a /tmp/linux-5.10.3 /tmp/butter && sync -f /tmp/butter; )  0.37s user 3.26s system 6% cpu 52.144 total

-> slow as hell since it's thousands of small files. Writeback runs at ~5-10 MB/s.

large file:

    $ fallocate -l 2G /tmp/largefile
    $ time (cp -a /tmp/largefile /tmp/butter && sync -f /tmp/butter)
    ( cp -a /tmp/largefile /tmp/butter && sync -f /tmp/butter; )  0.00s user 0.91s system 75% cpu 1.215 total

-> OK-ish since it's just one big file.

With your patch & the 'true' arg to btrfs_start_delalloc_roots() removed:

kernel tree:

    $ time (cp -a /tmp/linux-5.10.3 /tmp/butter && sync -f /tmp/butter)
    ( cp -a /tmp/linux-5.10.3 /tmp/butter && sync -f /tmp/butter; )  0.28s user 2.44s system 60% cpu 4.475 total

rewrite:

    $ time (cp -a /tmp/linux-5.10.3 /tmp/butter && sync -f /tmp/butter)
    ( cp -a /tmp/linux-5.10.3 /tmp/butter && sync -f /tmp/butter; )  0.28s user 2.87s system 93% cpu 3.357 total

Clearly better.

Hope this helps :)

Holger
