Re: Btrfs + compression = slow performance and high cpu usage
Hello again, list. I thought I would clear things up and describe what is happening with my troubled RAID setup. Having received the help from the list, I initially ran a full defragmentation of all the data and recompressed everything with zlib. That didn't help. Then I ran a full rebalance of the data, and that didn't help either. So I had to take a disk out of the RAID, copy all the data onto it, recreate the RAID volume with a 32 KB strip size (96 KB data stripe), and copy the data back. Then I added the disk back and resynced the RAID.

The RAID device is currently:

Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name                :
RAID Level          : Primary-5, Secondary-0, RAID Level Qualifier-3
Size                : 21.830 TB
Sector Size         : 512
Is VD emulated      : Yes
Parity Size         : 7.276 TB
State               : Optimal
Strip Size          : 32 KB
Number Of Drives    : 4
Span Depth          : 1
Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Default Access Policy: Read/Write
Current Access Policy: Read/Write
Disk Cache Policy   : Disk's Default
Encryption Type     : None
Bad Blocks Exist    : No
Is VD Cached        : No

It is about 40% full with compressed data:

# btrfs fi usage /mnt/arh-backup1/
Overall:
    Device size:                  21.83TiB
    Device allocated:              8.98TiB
    Device unallocated:           12.85TiB
    Device missing:                  0.00B
    Used:                          8.98TiB
    Free (estimated):             12.85TiB  (min: 6.43TiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB  (used: 0.00B)

I decided to run a set of tests in which a 5 GB file was created using different block sizes and different flags. One file was generated from urandom data and another was filled with zeroes. The data was written with and without compression, and it seems that without compression it is possible to gain 30-40% in speed, while the CPU was about 50% idle during the highest loads.

dd write speeds (MB/s)

flags: conv=fsync
           compress-force=zlib    compress-force=none
           RAND    ZERO           RAND    ZERO
bs=1024k   387     407            584     577
bs=512k    389     414            532     547
bs=256k    412     409            558     585
bs=128k    412     403            572     583
bs=64k     409     419            563     574
bs=32k     407     404            569     572

flags: oflag=sync
           compress-force=zlib    compress-force=none
           RAND    ZERO           RAND    ZERO
bs=1024k   86.1    97.0           203     210
bs=512k    50.6    64.4           85.0    170
bs=256k    25.0    29.8           67.6    67.5
bs=128k    13.2    16.4           48.4    49.8
bs=64k     7.4     8.3            24.5    27.9
bs=32k     3.8     4.1            14.0    13.7

flags: no flags
           compress-force=zlib    compress-force=none
           RAND    ZERO           RAND    ZERO
bs=1024k   480     419            681     595
bs=512k    422     412            633     585
bs=256k    413     384            707     712
bs=128k    414     387            695     704
bs=64k     482     467            622     587
bs=32k     416     412            610     598

I have also run a test where I filled the array to about 97% capacity, and the write speed went down by about 50% compared with the empty RAID.

Thanks for the help.

- Original Message -
From: "Peter Grandi" <p...@btrfs.list.sabi.co.uk>
To: "Linux fs Btrfs" <linux-btrfs@vger.kernel.org>
Sent: Tuesday, 1 August, 2017 10:09:03 PM
Subject: Re: Btrfs + compression = slow performance and high cpu usage

>> [ ... ] a "RAID5 with 128KiB writes and a 768KiB stripe
>> size". [ ... ] several back-to-back 128KiB writes [ ... ] get
>> merged by the 3ware firmware only if it has a persistent
>> cache, and maybe your 3ware does not have one,

> KOS: No I don't have persistent cache. Only the 512 MB cache
> on board of the controller, that is BBU.
If it is a persistent cache, that can be battery-backed (as I wrote, but it seems that you don't have too much time to read replies), then the size of the write, 128KiB or not, should not matter much; the write will be reported complete when it hits the persistent cache (whichever technology it uses), and then the HA firmware will spill write-cached data to the disks using the optimal operation width. Unless the 3ware firmware is really terrible (and depending on model and vintage it can be amazingly terrible), or the battery is no longer recharging and the host adapter has switched to write-through.

That you see very different rates between uncompressed and compressed writes, where the main difference is the limitation on the segment size, seems to indicate that compressed writes involve a lot of RMW, that is sub-stripe updates. As I mentioned already, it would be interesting to retry 'dd' with different 'bs' values without compression and with 'sync' (or 'direct', which only makes sense without compression).

> If I had additional SSD caching on the controller I would have
> mentioned it.
Re: Btrfs + compression = slow performance and high cpu usage
[ ... ]

> This is the "storage for beginners" version, what happens in
> practice however depends a lot on specific workload profile
> (typical read/write size and latencies and rates), caching and
> queueing algorithms in both Linux and the HA firmware.

To add a bit of slightly more advanced discussion, the main reason for larger strips ("chunk size") is to avoid the huge latencies of disk rotation when using unsynchronized disk drives, as detailed here:

http://www.sabi.co.uk/blog/12-thr.html?120310#120310

That relates only weakly to Btrfs.
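To put a rough number on that (purely an illustration -- assuming ordinary 7200rpm drives, which is not stated anywhere in this thread): one revolution takes 60/7200 = 8.3ms, so a single drive adds on average about 4.2ms of rotational latency per random access. If a small request is split across k unsynchronized drives, it completes only when the slowest platter has come around, and the expected worst case of k independent uniform delays is k/(k+1) of a revolution -- about 6.7ms for k=4 instead of 4.2ms for one drive. Larger strips keep small requests on a single spindle and avoid paying that penalty on every access.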
Re: Btrfs + compression = slow performance and high cpu usage
>> [ ... ] a "RAID5 with 128KiB writes and a 768KiB stripe
>> size". [ ... ] several back-to-back 128KiB writes [ ... ] get
>> merged by the 3ware firmware only if it has a persistent
>> cache, and maybe your 3ware does not have one,

> KOS: No I don't have persistent cache. Only the 512 MB cache
> on board of the controller, that is BBU.

If it is a persistent cache, that can be battery-backed (as I wrote, but it seems that you don't have too much time to read replies), then the size of the write, 128KiB or not, should not matter much; the write will be reported complete when it hits the persistent cache (whichever technology it uses), and then the HA firmware will spill write-cached data to the disks using the optimal operation width. Unless the 3ware firmware is really terrible (and depending on model and vintage it can be amazingly terrible), or the battery is no longer recharging and the host adapter has switched to write-through.

That you see very different rates between uncompressed and compressed writes, where the main difference is the limitation on the segment size, seems to indicate that compressed writes involve a lot of RMW, that is sub-stripe updates. As I mentioned already, it would be interesting to retry 'dd' with different 'bs' values without compression and with 'sync' (or 'direct', which only makes sense without compression).

> If I had additional SSD caching on the controller I would have
> mentioned it.

So far you had not mentioned the presence of a BBU cache either, which is equivalent, even if in one of your previous messages (which I try to read carefully) there were these lines:

Default Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAhead, Direct, No Write Cache if Bad BBU

So perhaps someone else would have checked long ago the status of the BBU and whether the "No Write Cache if Bad BBU" case has happened. If the BBU is still working and the policy is still "WriteBack" then things are stranger still.

> I was also under the impression that in a situation where mostly
> extra large files will be stored on the array, a bigger
> strip size would indeed increase the speed, thus I went
> with the 256 KB strip size.

That runs counter to this simple story: suppose a program is doing 64KiB IO:

* For *reads*, if there are 4 data drives and the strip size is 16KiB, the 64KiB will be read in parallel from 4 drives. If the strip size is 256KiB then the 64KiB will be read sequentially from just one disk, and 4 successive reads will be read sequentially from the same drive.

* For *writes* on a parity RAID like RAID5 things are much, much more extreme: the 64KiB will be written with 16KiB strips on a 5-wide RAID5 set in parallel to 5 drives, with 4 stripes being updated with RMW. But with 256KiB strips it will partially update 5 drives, because the stripe is 1024+256KiB, and it needs to do RMW, and four successive 64KiB writes will need to do that too, even if only one drive is updated. Usually for RAID5 there is an optimization that means that only the specific target drive and the parity drive(s) need RMW, but it is still very expensive.

This is the "storage for beginners" version; what happens in practice however depends a lot on the specific workload profile (typical read/write sizes, latencies and rates) and the caching and queueing algorithms in both Linux and the HA firmware.

> Would I be correct in assuming that the RAID strip size of 128
> Kb will be a better choice if one plans to use the BTRFS with
> compression?
That would need to be tested, because it "depends a lot on specific workload profile, caching and queueing algorithms", but my expectation is that the lower the better. Given that you have 4 drives giving a 3+1 RAID set, perhaps a 32KiB or 64KiB strip size, giving a data stripe size of 96KiB or 192KiB, would be better.
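Spelling out the arithmetic behind that suggestion (just an illustration of the reasoning above, using the 128KiB maximum write size that Btrfs compression produces): with 4 drives in RAID5 there are 3 data strips per stripe, so

  strip 256KiB -> data stripe 3 x 256KiB = 768KiB; a 128KiB write covers half of one strip, so every write is a sub-stripe update and needs RMW.
  strip  64KiB -> data stripe 3 x  64KiB = 192KiB; a 128KiB write covers two of the three strips, still a partial stripe.
  strip  32KiB -> data stripe 3 x  32KiB =  96KiB; a 128KiB write covers one full stripe plus one extra strip, so most of it can go down without RMW (assuming writes are reasonably aligned and merged).

This matches the 32 KB strip / 96 KB data stripe that the array was eventually rebuilt with at the top of the thread.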
Re: Btrfs + compression = slow performance and high cpu usage
- Original Message -
From: "Peter Grandi" <p...@btrfs.list.sabi.co.uk>
To: "Linux fs Btrfs" <linux-btrfs@vger.kernel.org>
Sent: Tuesday, 1 August, 2017 3:14:07 PM
Subject: Re: Btrfs + compression = slow performance and high cpu usage

> Peter, I don't think the filefrag is showing the correct
> fragmentation status of the file when the compression is used.

As I wrote, "their size is just limited by the compression code" which results in "128KiB writes". On a "fresh empty Btrfs volume" the compressed extents limited to 128KiB also happen to be pretty physically contiguous, but on a more fragmented free space list they can be more scattered.

KOS: Ok, thanks for pointing it out. I have compared the filefrag -v output on another btrfs that is not fragmented and can see the difference with what is happening on the sluggish one.

5824: 186368.. 186399: 2430093383..2430093414: 32: 2430093414: encoded
5825: 186400.. 186431: 2430093384..2430093415: 32: 2430093415: encoded
5826: 186432.. 186463: 2430093385..2430093416: 32: 2430093416: encoded
5827: 186464.. 186495: 2430093386..2430093417: 32: 2430093417: encoded
5828: 186496.. 186527: 2430093387..2430093418: 32: 2430093418: encoded
5829: 186528.. 186559: 2430093388..2430093419: 32: 2430093419: encoded
5830: 186560.. 186591: 2430093389..2430093420: 32: 2430093420: encoded

As I already wrote, the main issue here seems to be that we are talking about a "RAID5 with 128KiB writes and a 768KiB stripe size". On MD RAID5 the slowdown because of RMW seems only to be around 30-40%, but it looks like several back-to-back 128KiB writes get merged by the Linux IO subsystem (not sure whether that's thoroughly legal), and perhaps they get merged by the 3ware firmware only if it has a persistent cache, and maybe your 3ware does not have one, but you have kept your counsel as to that.

KOS: No, I don't have a persistent cache. Only the 512 MB cache on board of the controller, that is BBU. If I had additional SSD caching on the controller I would have mentioned it.

I was also under the impression that in a situation where mostly extra large files will be stored on the array, a bigger strip size would indeed increase the speed, thus I went with the 256 KB strip size.

Would I be correct in assuming that a RAID strip size of 128 KB will be a better choice if one plans to use BTRFS with compression?

thanks,
kos
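On the BBU point raised above: the virtual-drive listing earlier in the thread looks like LSI MegaRAID (MegaCli) output, so -- as a sketch, assuming that utility is available (the binary may be installed as MegaCli, MegaCli64 or megacli) -- the battery state and the effective cache policy can be checked with something like:

# MegaCli64 -AdpBbuCmd -GetBbuStatus -aALL | grep -iE 'battery|charg|state'
# MegaCli64 -LDInfo -Lall -aALL | grep -i 'cache policy'

If "Current Cache Policy" has dropped from WriteBack to WriteThrough, the "No Write Cache if Bad BBU" case has been triggered and write performance will suffer regardless of the filesystem.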
Re: Btrfs + compression = slow performance and high cpu usage
> Peter, I don't think the filefrag is showing the correct
> fragmentation status of the file when the compression is used.

As reported in a previous message, the output of 'filefrag'/'filefrag -v' can be used to see what is going on:

filefrag /mnt/sde3/testfile
/mnt/sde3/testfile: 49287 extents found

Most of the latter extents are mercifully rather contiguous, their size is just limited by the compression code; here is an extract from 'filefrag -v' from around the middle:

24757: 1321888.. 1321919: 11339579.. 11339610: 32: 11339594:
24758: 1321920.. 1321951: 11339597.. 11339628: 32: 11339611:
24759: 1321952.. 1321983: 11339615.. 11339646: 32: 11339629:
24760: 1321984.. 1322015: 11339632.. 11339663: 32: 11339647:
24761: 1322016.. 1322047: 11339649.. 11339680: 32: 11339664:
24762: 1322048.. 1322079: 11339667.. 11339698: 32: 11339681:
24763: 1322080.. 1322111: 11339686.. 11339717: 32: 11339699:
24764: 1322112.. 1322143: 11339703.. 11339734: 32: 11339718:
24765: 1322144.. 1322175: 11339720.. 11339751: 32: 11339735:
24766: 1322176.. 1322207: 11339737.. 11339768: 32: 11339752:
24767: 1322208.. 1322239: 11339754.. 11339785: 32: 11339769:
24768: 1322240.. 1322271: 11339771.. 11339802: 32: 11339786:
24769: 1322272.. 1322303: 11339789.. 11339820: 32: 11339803:

But again this is on a fresh empty Btrfs volume. As I wrote, "their size is just limited by the compression code" which results in "128KiB writes". On a "fresh empty Btrfs volume" the compressed extents limited to 128KiB also happen to be pretty physically contiguous, but on a more fragmented free space list they can be more scattered.

As I already wrote, the main issue here seems to be that we are talking about a "RAID5 with 128KiB writes and a 768KiB stripe size". On MD RAID5 the slowdown because of RMW seems only to be around 30-40%, but it looks like several back-to-back 128KiB writes get merged by the Linux IO subsystem (not sure whether that's thoroughly legal), and perhaps they get merged by the 3ware firmware only if it has a persistent cache, and maybe your 3ware does not have one, but you have kept your counsel as to that.

My impression is that you read the Btrfs documentation and my replies with a lot less attention than I write them. Some of the things you have done and said make me think that you did not read https://btrfs.wiki.kernel.org/index.php/Compression and 'man 5 btrfs', for example:

  "How does compression interact with direct IO or COW?
   Compression does not work with DIO, does work with COW and does
   not work for NOCOW files. If a file is opened in DIO mode, it
   will fall back to buffered IO.

   Are there speed penalties when doing random access to a
   compressed file?
   Yes. The compression processes ranges of a file of maximum size
   128 KiB and compresses each 4 KiB (or page-sized) block
   separately."

> I am currently defragmenting that mountpoint, ensuring that
> everything is compressed with zlib.

Defragmenting the used space might help find more contiguous allocations.

> p.s. any other suggestion that might help with the fragmentation
> and data allocation. Should I try and rebalance the data on the
> drive?

Yes, regularly, as that defragments the unused space.
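A minimal sketch of the kind of regular rebalance meant here, using the 'usage' balance filters so that only mostly-empty chunks are rewritten rather than all of the data (the 50% thresholds and the mount point are illustrative, not from the original mails):

# btrfs balance start -dusage=50 -musage=50 /mnt/arh-backup1

Run periodically, this compacts sparsely-used chunks and returns their space to the unallocated pool, which keeps free space less fragmented.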
RE: Btrfs + compression = slow performance and high cpu usage
> -----Original Message-----
> From: linux-btrfs-ow...@vger.kernel.org [mailto:linux-btrfs-
> ow...@vger.kernel.org] On Behalf Of Konstantin V. Gavrilenko
> Sent: Tuesday, 1 August 2017 7:58 PM
> To: Peter Grandi <p...@btrfs.list.sabi.co.uk>
> Cc: Linux fs Btrfs <linux-btrfs@vger.kernel.org>
> Subject: Re: Btrfs + compression = slow performance and high cpu usage
>
> Peter, I don't think the filefrag is showing the correct fragmentation status of
> the file when the compression is used.
> At least the one that is installed by default in Ubuntu 16.04 - e2fsprogs |
> 1.42.13-1ubuntu1
>
> So for example, the fragmentation of the compressed file is 320 times more than
> that of the uncompressed one.
>
> root@homenas:/mnt/storage/NEW# filefrag test5g-zeroes
> test5g-zeroes: 40903 extents found
>
> root@homenas:/mnt/storage/NEW# filefrag test5g-data
> test5g-data: 129 extents found

Compressed extents are about 128 KiB, uncompressed extents are about 128 MiB (can't remember the exact numbers). I've had trouble with slow filesystems when using compression. The problem seems to go away when removing compression.

Paul.
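To see the individual extent sizes rather than just the counts, 'filefrag -v' can be used on the same files (a sketch reusing the file names from the quoted message; output trimmed with head):

root@homenas:/mnt/storage/NEW# filefrag -v test5g-zeroes | head
root@homenas:/mnt/storage/NEW# filefrag -v test5g-data | head

In the verbose listings shown elsewhere in this thread the length column is 32 blocks per extent, i.e. 32 x 4 KiB = 128 KiB, matching the documented maximum size of a compressed extent.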
Re: Btrfs + compression = slow performance and high cpu usage
Peter, I don't think the filefrag is showing the correct fragmentation status of the file when the compression is used. At least the one that is installed by default in Ubuntu 16.04 - e2fsprogs | 1.42.13-1ubuntu1

So for example, the fragmentation of the compressed file is 320 times more than that of the uncompressed one.

root@homenas:/mnt/storage/NEW# filefrag test5g-zeroes
test5g-zeroes: 40903 extents found

root@homenas:/mnt/storage/NEW# filefrag test5g-data
test5g-data: 129 extents found

I am currently defragmenting that mountpoint, ensuring that everything is compressed with zlib.

# btrfs fi defragment -rv -czlib /mnt/arh-backup

My guess is that it will take another 24-36 hours to complete and then I will redo the test to see if that has helped. Will keep the list posted.

p.s. any other suggestions that might help with the fragmentation and data allocation? Should I try and rebalance the data on the drive?

kos

- Original Message -
From: "Peter Grandi" <p...@btrfs.list.sabi.co.uk>
To: "Linux fs Btrfs" <linux-btrfs@vger.kernel.org>
Sent: Monday, 31 July, 2017 1:41:07 PM
Subject: Re: Btrfs + compression = slow performance and high cpu usage

[ ... ]

> grep 'model name' /proc/cpuinfo | sort -u
> model name : Intel(R) Xeon(R) CPU E5645 @ 2.40GHz

Good, contemporary CPU with all accelerations.

> The sda device is a hardware RAID5 consisting of 4x8TB drives. [ ... ]
> Strip Size : 256 KB

So the full RMW data stripe length is 768KiB.

> [ ... ] don't see the previously reported behaviour of one of
> the kworker consuming 100% of the cputime, but the write speed
> difference between the compression ON vs OFF is pretty large.

That's weird; of course 'lzo' is a lot cheaper than 'zlib', but in my test the much higher CPU time of the latter was spread across many CPUs, while in your case it wasn't, even if the E5645 has 6 CPUs and can do 12 threads. That seemed to point to some high cost of finding free blocks, that is a very fragmented free list, or something else.

> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress oflag=direct
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 26.0685 s, 206 MB/s

The results with 'oflag=direct' are not relevant, because Btrfs behaves "differently" with that.

> mountflags:
> (rw,relatime,compress-force=zlib,space_cache=v2,subvolid=5,subvol=/) [ ... ]
> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress conv=fsync
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 77.4845 s, 69.3 MB/s

> mountflags:
> (rw,relatime,compress-force=lzo,space_cache=v2,subvolid=5,subvol=/) [ ... ]
> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress conv=fsync
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 122.321 s, 43.9 MB/s

That's pretty good for a RAID5 with 128KiB writes and a 768KiB stripe size, on a 3ware, and it looks like the hw host adapter does not have a persistent cache (battery backed usually). My guess is that watching transfer rates and latencies with 'iostat -dk -zyx 1' did not happen.

> mountflags: (rw,relatime,space_cache=v2,subvolid=5,subvol=/) [ ... ]
> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress conv=fsync
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 10.1033 s, 531 MB/s

I had mentioned in my previous reply the output of 'filefrag'. That to me seems relevant here, because of RAID5 RMW and maximum extent size with Btrfs compression and strip/stripe size. Perhaps redoing the tests with a 128KiB 'bs' *without* compression would be interesting, perhaps even with 'oflag=sync' instead of 'conv=fsync'.
It is hard for me to see a speed issue here with Btrfs: for comparison I have done a simple test with a both a 3+1 MD RAID5 set with a 256KiB chunk size and a single block device on "contemporary" 1T/2TB drives, capable of sequential transfer rates of 150-190MB/s: soft# grep -A2 sdb3 /proc/mdstat md127 : active raid5 sde3[4] sdd3[2] sdc3[1] sdb3[0] 729808128 blocks super 1.0 level 5, 256k chunk, algorithm 2 [4/4] [] with compression: soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/md/test5 /mnt/test5 soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/sdg3 /mnt/sdg3 soft# rm -f /mnt/test5/testfile /mnt/sdg3/testfile soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/test5/testfile bs=1M count=1 conv=fsync 1+0 records in 1+0 records out 1048576 bytes (10 GB) copied, 94.3605 s, 111 MB/s 0.01user 12.59system 1:34.36elapsed 13%CPU (0avgtext+0avgdata 2932maxresident)k 13042144inputs+20482144outputs (3major+345minor)pagefaults 0swaps soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/sdg3/testfile bs=1M count=1 conv=fsync 1+0 records in 1+0 records out 1048576 bytes (10 GB) copied, 93.5885 s, 112 MB/s 0.03user 12.35syst
Re: Btrfs + compression = slow performance and high cpu usage
> [ ... ] It is hard for me to see a speed issue here with > Btrfs: for comparison I have done a simple test with a both a > 3+1 MD RAID5 set with a 256KiB chunk size and a single block > device on "contemporary" 1T/2TB drives, capable of sequential > transfer rates of 150-190MB/s: [ ... ] The figures after this are a bit on the low side because I realized looking at 'vmstat' that the source block device 'sda6' was being a bottleneck, as the host has only 8GiB instead of the 16GiB I misremembered, and also 'sda' is a relatively slow flash SSD that reads are most at around 220MB/s. So I have redone the simple tests with a transfer size of 3GB, which ensures that all reads are from memory cache: with compression: soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/md/test5 /mnt/test5 soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/sdg3 /mnt/sdg3 soft# rm -f /mnt/test5/testfile /mnt/sdg3/testfile soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/test5/testfile bs=1M count=3000 conv=fsync 3000+0 records in 3000+0 records out 3145728000 bytes (3.1 GB) copied, 15.8869 s, 198 MB/s 0.00user 2.80system 0:15.88elapsed 17%CPU (0avgtext+0avgdata 3056maxresident)k 0inputs+6148256outputs (0major+346minor)pagefaults 0swaps soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/sdg3/testfile bs=1M count=3000 conv=fsync 3000+0 records in 3000+0 records out 3145728000 bytes (3.1 GB) copied, 16.9663 s, 185 MB/s 0.00user 2.61system 0:16.96elapsed 15%CPU (0avgtext+0avgdata 3056maxresident)k 0inputs+6144672outputs (0major+346minor)pagefaults 0swaps soft# btrfs fi df /mnt/test5/ | grep Data Data, single: total=3.00GiB, used=2.28GiB soft# btrfs fi df /mnt/sdg3 | grep Data Data, single: total=3.00GiB, used=2.28GiB soft# filefrag /mnt/test5/testfile /mnt/sdg3/testfile /mnt/test5/testfile: 8811 extents found /mnt/sdg3/testfile: 8759 extents found Slightly weird that with a 3GB size the number of extents is almost double that for the 10GB, but I guess that depends on speed. 
Then without compression: soft# mount -t btrfs -o commit=10 /dev/md/test5 /mnt/test5 soft# mount -t btrfs -o commit=10 /dev/sdg3 /mnt/sdg3 soft# rm -f /mnt/test5/testfile /mnt/sdg3/testfile soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/test5/testfile bs=1M count=3000 conv=fsync 3000+0 records in 3000+0 records out 3145728000 bytes (3.1 GB) copied, 8.06841 s, 390 MB/s 0.00user 3.90system 0:08.80elapsed 44%CPU (0avgtext+0avgdata 2880maxresident)k 0inputs+6153856outputs (0major+345minor)pagefaults 0swaps soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/sdg3/testfile bs=1M count=3000 conv=fsync 3000+0 records in 3000+0 records out 3145728000 bytes (3.1 GB) copied, 30.215 s, 104 MB/s 0.00user 4.82system 0:30.93elapsed 15%CPU (0avgtext+0avgdata 2888maxresident)k 0inputs+6152128outputs (0major+347minor)pagefaults 0swaps soft# filefrag /mnt/test5/testfile /mnt/sdg3/testfile /mnt/test5/testfile: 5 extents found /mnt/sdg3/testfile: 3 extents found Also added: soft# rm -f /mnt/test5/testfile /mnt/sdg3/testfile soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 bs=128k count=3000 | dd iflag=fullblock of=/mnt/test5/testfile bs=128k oflag=sync 3000+0 records in 3000+0 records out 393216000 bytes (393 MB) copied, 160.315 s, 2.5 MB/s 0.02user 0.46system 2:40.31elapsed 0%CPU (0avgtext+0avgdata 1992maxresident)k 0inputs+0outputs (0major+124minor)pagefaults 0swaps 3000+0 records in 3000+0 records out 393216000 bytes (393 MB) copied, 160.365 s, 2.5 MB/s soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 bs=128k count=3000 | dd iflag=fullblock of=/mnt/sdg3/testfile bs=128k oflag=sync 3000+0 records in 3000+0 records out 393216000 bytes (393 MB) copied, 113.51 s, 3.5 MB/s 0.02user 0.56system 1:53.51elapsed 0%CPU (0avgtext+0avgdata 2156maxresident)k 0inputs+0outputs (0major+120minor)pagefaults 0swaps 3000+0 records in 3000+0 records out 393216000 bytes (393 MB) copied, 113.544 s, 3.5 MB/s soft# filefrag /mnt/test5/testfile /mnt/sdg3/testfile /mnt/test5/testfile: 1 extent found /mnt/sdg3/testfile: 22 extents found soft# rm -f /mnt/test5/testfile /mnt/sdg3/testfile soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 bs=1M count=1000 | dd iflag=fullblock of=/mnt/test5/testfile bs=1M oflag=sync
Re: Btrfs + compression = slow performance and high cpu usage
[ ... ] > Also added: Feeling very generous :-) today, adding these too: soft# mkfs.btrfs -mraid10 -draid10 -L test5 /dev/sd{b,c,d,e}3 [ ... ] soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/sdb3 /mnt/test5 soft# rm -f /mnt/test5/testfile soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/test5/testfile bs=1M count=3000 conv=fsync 3000+0 records in 3000+0 records out 3145728000 bytes (3.1 GB) copied, 14.2166 s, 221 MB/s 0.00user 2.54system 0:14.21elapsed 17%CPU (0avgtext+0avgdata 3056maxresident)k 0inputs+6144768outputs (0major+346minor)pagefaults 0swaps soft# rm -f /mnt/test5/testfile soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/test5/testfile bs=128k count=3000 conv=fsync 3000+0 records in 3000+0 records out 393216000 bytes (393 MB) copied, 2.05933 s, 191 MB/s 0.00user 0.32system 0:02.06elapsed 15%CPU (0avgtext+0avgdata 1996maxresident)k 0inputs+772512outputs (0major+124minor)pagefaults 0swaps soft# rm -f /mnt/test5/testfile soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 bs=1M count=1000 | dd iflag=fullblock of=/mnt/test5/testfile bs=1M oflag=sync 1000+0 records in 1000+0 records out 1048576000 bytes (1.0 GB) copied, 60.6019 s, 17.3 MB/s 0.01user 1.04system 1:00.60elapsed 1%CPU (0avgtext+0avgdata 2888maxresident)k 0inputs+0outputs (0major+348minor)pagefaults 0swaps 1000+0 records in 1000+0 records out 1048576000 bytes (1.0 GB) copied, 60.4116 s, 17.4 MB/s soft# rm -f /mnt/test5/testfile soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 bs=128k count=3000 | dd iflag=fullblock of=/mnt/test5/testfile bs=128k oflag=sync 3000+0 records in 3000+0 records out 393216000 bytes (393 MB) copied, 148.04 s, 2.7 MB/s 0.00user 0.62system 2:28.04elapsed 0%CPU (0avgtext+0avgdata 1996maxresident)k 0inputs+0outputs (0major+125minor)pagefaults 0swaps 3000+0 records in 3000+0 records out 393216000 bytes (393 MB) copied, 148.083 s, 2.7 MB/s soft# sysctl vm/drop_caches=3 vm.drop_caches = 3 soft# /usr/bin/time dd iflag=fullblock if=/mnt/test5/testfile bs=128k count=3000 of=/dev/zero 3000+0 records in 3000+0 records out 393216000 bytes (393 MB) copied, 1.09729 s, 358 MB/s 0.00user 0.24system 0:01.10elapsed 23%CPU (0avgtext+0avgdata 2164maxresident)k 459768inputs+0outputs (3major+121minor)pagefaults 0swaps -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Btrfs + compression = slow performance and high cpu usage
[ ... ]

> grep 'model name' /proc/cpuinfo | sort -u
> model name : Intel(R) Xeon(R) CPU E5645 @ 2.40GHz

Good, contemporary CPU with all accelerations.

> The sda device is a hardware RAID5 consisting of 4x8TB drives. [ ... ]
> Strip Size : 256 KB

So the full RMW data stripe length is 768KiB.

> [ ... ] don't see the previously reported behaviour of one of
> the kworker consuming 100% of the cputime, but the write speed
> difference between the compression ON vs OFF is pretty large.

That's weird; of course 'lzo' is a lot cheaper than 'zlib', but in my test the much higher CPU time of the latter was spread across many CPUs, while in your case it wasn't, even if the E5645 has 6 CPUs and can do 12 threads. That seemed to point to some high cost of finding free blocks, that is a very fragmented free list, or something else.

> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress oflag=direct
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 26.0685 s, 206 MB/s

The results with 'oflag=direct' are not relevant, because Btrfs behaves "differently" with that.

> mountflags:
> (rw,relatime,compress-force=zlib,space_cache=v2,subvolid=5,subvol=/) [ ... ]
> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress conv=fsync
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 77.4845 s, 69.3 MB/s

> mountflags:
> (rw,relatime,compress-force=lzo,space_cache=v2,subvolid=5,subvol=/) [ ... ]
> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress conv=fsync
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 122.321 s, 43.9 MB/s

That's pretty good for a RAID5 with 128KiB writes and a 768KiB stripe size, on a 3ware, and it looks like the hw host adapter does not have a persistent cache (battery backed usually). My guess is that watching transfer rates and latencies with 'iostat -dk -zyx 1' did not happen.

> mountflags: (rw,relatime,space_cache=v2,subvolid=5,subvol=/) [ ... ]
> dd if=/dev/sdb of=./testing count=5120 bs=1M status=progress conv=fsync
> 5368709120 bytes (5.4 GB, 5.0 GiB) copied, 10.1033 s, 531 MB/s

I had mentioned in my previous reply the output of 'filefrag'. That to me seems relevant here, because of RAID5 RMW and maximum extent size with Btrfs compression and strip/stripe size. Perhaps redoing the tests with a 128KiB 'bs' *without* compression would be interesting, perhaps even with 'oflag=sync' instead of 'conv=fsync'.
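A sketch of what that suggested re-test could look like, reusing the source device, target file name and 5GiB size from the earlier tests in this thread (the count is simply 5GiB/128KiB; run it on a mount without compress/compress-force set -- this is an illustration, not a command from the original mails):

# dd if=/dev/sdb of=./testing bs=128k count=40960 oflag=sync status=progress

The tables at the top of this thread (bs=32k..1024k, with oflag=sync, conv=fsync and no flags, compressed vs uncompressed) are essentially the results of this kind of run.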
It is hard for me to see a speed issue here with Btrfs: for comparison I have done a simple test with a both a 3+1 MD RAID5 set with a 256KiB chunk size and a single block device on "contemporary" 1T/2TB drives, capable of sequential transfer rates of 150-190MB/s: soft# grep -A2 sdb3 /proc/mdstat md127 : active raid5 sde3[4] sdd3[2] sdc3[1] sdb3[0] 729808128 blocks super 1.0 level 5, 256k chunk, algorithm 2 [4/4] [] with compression: soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/md/test5 /mnt/test5 soft# mount -t btrfs -o commit=10,compress-force=zlib /dev/sdg3 /mnt/sdg3 soft# rm -f /mnt/test5/testfile /mnt/sdg3/testfile soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/test5/testfile bs=1M count=1 conv=fsync 1+0 records in 1+0 records out 1048576 bytes (10 GB) copied, 94.3605 s, 111 MB/s 0.01user 12.59system 1:34.36elapsed 13%CPU (0avgtext+0avgdata 2932maxresident)k 13042144inputs+20482144outputs (3major+345minor)pagefaults 0swaps soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/sdg3/testfile bs=1M count=1 conv=fsync 1+0 records in 1+0 records out 1048576 bytes (10 GB) copied, 93.5885 s, 112 MB/s 0.03user 12.35system 1:33.59elapsed 13%CPU (0avgtext+0avgdata 2940maxresident)k 13042144inputs+20482400outputs (3major+346minor)pagefaults 0swaps soft# filefrag /mnt/test5/testfile /mnt/sdg3/testfile /mnt/test5/testfile: 48945 extents found /mnt/sdg3/testfile: 49029 extents found soft# btrfs fi df /mnt/test5/ | grep Data Data, single: total=7.00GiB, used=6.55GiB soft# btrfs fi df /mnt/sdg3 | grep Data Data, single: total=7.00GiB, used=6.55GiB soft# sysctl vm/drop_caches=3 vm.drop_caches = 3 soft# /usr/bin/time dd iflag=fullblock if=/mnt/test5/testfile bs=1M count=1 of=/dev/zero 1+0 records in 1+0 records out 1048576 bytes (10 GB) copied, 23.2975 s, 450 MB/s 0.01user 7.59system 0:23.32elapsed 32%CPU (0avgtext+0avgdata 2932maxresident)k 13759624inputs+0outputs (3major+344minor)pagefaults 0swaps soft# sysctl vm/drop_caches=3 vm.drop_caches = 3 soft# /usr/bin/time dd iflag=fullblock if=/mnt/sdg3/testfile bs=1M count=1 of=/dev/zero 1+0 records in 1+0 records out 1048576 bytes (10 GB) copied, 35.0032 s, 300 MB/s 0.01user 8.46system 0:35.03elapsed 24%CPU (0avgtext+0avgdata 2924maxresident)k 13750568inputs+0outputs (3major+345minor)pagefaults 0swaps and
Re: Btrfs + compression = slow performance and high cpu usage
            all    0.00    0.00    4.84    5.09    0.00   90.08
14:31:45    all    0.17    0.00    4.67    4.75    0.00   90.42
14:31:46    all    0.00    0.00    4.60    3.76    0.00   91.64
14:31:47    all    0.08    0.00    5.07    3.66    0.00   91.18
14:31:48    all    0.00    0.00    5.01    3.68    0.00   91.31
14:31:49    all    0.00    0.00    4.76    3.68    0.00   91.56
14:31:50    all    0.08    0.00    4.59    3.59    0.00   91.73
14:31:51    all    0.00    0.00    2.67    1.92    0.00   95.41

- Original Message -
From: "Peter Grandi" <p...@btrfs.list.sabi.co.uk>
To: "Linux fs Btrfs" <linux-btrfs@vger.kernel.org>
Sent: Friday, 28 July, 2017 8:08:47 PM
Subject: Re: Btrfs + compression = slow performance and high cpu usage

> I am stuck with a problem of btrfs slow performance when using
> compression. [ ... ]

That to me looks like an issue with speed, not performance, and in particular with PEBCAK issues. As to high CPU usage, when you find a way to do both compression and checksumming without using much CPU time, please send patches urgently :-).

In your case the increase in CPU time is bizarre. I have the Ubuntu 4.4 "lts-xenial" kernel and what you report does not happen here (with a few little changes):

soft# grep 'model name' /proc/cpuinfo | sort -u
model name      : AMD FX(tm)-6100 Six-Core Processor

soft# cpufreq-info | grep 'current CPU frequency'
  current CPU frequency is 3.30 GHz (asserted by call to hardware).
  current CPU frequency is 3.30 GHz (asserted by call to hardware).
  current CPU frequency is 3.30 GHz (asserted by call to hardware).
  current CPU frequency is 3.30 GHz (asserted by call to hardware).
  current CPU frequency is 3.30 GHz (asserted by call to hardware).
  current CPU frequency is 3.30 GHz (asserted by call to hardware).

soft# lsscsi | grep 'sd[ae]'
[0:0:0:0]    disk    ATA      HFS256G32MNB-220 3L00  /dev/sda
[5:0:0:0]    disk    ATA      ST2000DM001-1CH1 CC44  /dev/sde

soft# mkfs.btrfs -f /dev/sde3
[ ... ]
soft# mount -t btrfs -o discard,autodefrag,compress=lzo,compress-force,commit=10 /dev/sde3 /mnt/sde3

soft# df /dev/sda6 /mnt/sde3
Filesystem     1M-blocks  Used Available Use% Mounted on
/dev/sda6          90048 76046     14003  85% /
/dev/sde3         237568    19    235501   1% /mnt/sde3

The above is useful context information that was "amazingly" omitted from your report.
In dmesg I see (not the "force zlib compression"): [327730.917285] BTRFS info (device sde3): turning on discard [327730.917294] BTRFS info (device sde3): enabling auto defrag [327730.917300] BTRFS info (device sde3): setting 8 feature flag [327730.917304] BTRFS info (device sde3): force zlib compression [327730.917313] BTRFS info (device sde3): disk space caching is enabled [327730.917315] BTRFS: has skinny extents [327730.917317] BTRFS: flagging fs with big metadata feature [327730.920740] BTRFS: creating UUID tree and the result is: soft# pv -tpreb /dev/sda6 | time dd iflag=fullblock of=/mnt/sde3/testfile bs=1M count=1 oflag=direct 1+0 records in17MB/s] [==>] 11% ETA 0:15:06 1+0 records out 1048576 bytes (10 GB) copied, 112.845 s, 92.9 MB/s 0.05user 9.93system 1:53.20elapsed 8%CPU (0avgtext+0avgdata 3016maxresident)k 120inputs+20496000outputs (1major+346minor)pagefaults 0swaps 9.77GB 0:01:53 [88.3MB/s] [==>] 11% soft# btrfs fi df /mnt/sde3/ Data, single: total=10.01GiB, used=9.77GiB System, DUP: total=8.00MiB, used=16.00KiB Metadata, DUP: total=1.00GiB, used=11.66MiB GlobalReserve, single: total=16.00MiB, used=0.00B As it was running system CPU time was under 20% of one CPU: top - 18:57:29 up 3 days, 19:27, 4 users, load average: 5.44, 2.82, 1.45 Tasks: 325 total, 1 running, 324 sleeping, 0 stopped, 0 zombie %Cpu0 : 0.0 us, 2.3 sy, 0.0 ni, 91.3 id, 6.3 wa, 0.0 hi, 0.0 si, 0.0 st %Cpu1 : 0.0 us, 1.3 sy, 0.0 ni, 78.5 id, 20.2 wa, 0.0 hi, 0.0 si, 0.0 st %Cpu2 : 0.3 us, 5.8 sy, 0.0 ni, 81.0 id, 12.5 wa, 0.0 hi, 0.3 si, 0.0 st %Cpu3 : 0.3 us, 3.4 sy, 0.0 ni, 91.9 id, 4.4 wa, 0.0 hi, 0.0 si, 0.0 st %Cpu4 : 0.3 us, 10.6 sy, 0.0 ni, 55.4 id, 33.7 wa, 0.0 hi, 0.0 si, 0.0 st %Cpu5 : 0.0 us, 0.3 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st KiB Mem: 8120660 total, 5162236 used, 2958424 free, 4440100 buffers KiB Swap:0 total,0 used,0 free. 351848 cached Mem PID PPID USER PR NIVIRTRESDATA %CPU %MEM TIME+ TTY COMMAND 21047 21046 root 20 08872 26161364 12.9 0.0 0:02.31 pts/3dd iflag=fullblo+ 21045 3535 root 20 07928 1948
Re: Btrfs + compression = slow performance and high cpu usage
In addition to my previous "it does not happen here" comment, if someone is reading this thread, there are some other interesting details:

> When the compression is turned off, I am able to get the
> maximum 500-600 mb/s write speed on this disk (raid array)
> with minimal cpu usage.

No details on whether it is a parity RAID or not.

> btrfs device usage /mnt/arh-backup1/
> /dev/sda, ID: 2
>    Device size:           21.83TiB
>    Device slack:             0.00B
>    Data,single:            9.29TiB
>    Metadata,single:       46.00GiB
>    System,single:         32.00MiB
>    Unallocated:           12.49TiB

That's exactly 24TB of "Device size", of which around 45% are used, and the string "backup" may suggest that the content is backups, which may indicate a very fragmented free space. Of course compression does not help with that; on my freshly created Btrfs volume I get as expected:

soft# umount /mnt/sde3
soft# mount -t btrfs -o commit=10 /dev/sde3 /mnt/sde3
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/sde3/testfile bs=1M count=10000 conv=fsync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 103.747 s, 101 MB/s
0.00user 11.56system 1:44.86elapsed 11%CPU (0avgtext+0avgdata 3072maxresident)k
20480672inputs+20498272outputs (1major+349minor)pagefaults 0swaps
soft# filefrag /mnt/sde3/testfile
/mnt/sde3/testfile: 11 extents found

versus:

soft# umount /mnt/sde3
soft# mount -t btrfs -o commit=10,compress=lzo,compress-force /dev/sde3 /mnt/sde3
soft# /usr/bin/time dd iflag=fullblock if=/dev/sda6 of=/mnt/sde3/testfile bs=1M count=10000 conv=fsync
10000+0 records in
10000+0 records out
10485760000 bytes (10 GB) copied, 109.051 s, 96.2 MB/s
0.02user 13.03system 1:49.49elapsed 11%CPU (0avgtext+0avgdata 3068maxresident)k
20494784inputs+20492320outputs (1major+347minor)pagefaults 0swaps
soft# filefrag /mnt/sde3/testfile
/mnt/sde3/testfile: 49287 extents found

Most of the latter extents are mercifully rather contiguous, their size is just limited by the compression code; here is an extract from 'filefrag -v' from around the middle:

24757: 1321888.. 1321919: 11339579.. 11339610: 32: 11339594:
24758: 1321920.. 1321951: 11339597.. 11339628: 32: 11339611:
24759: 1321952.. 1321983: 11339615.. 11339646: 32: 11339629:
24760: 1321984.. 1322015: 11339632.. 11339663: 32: 11339647:
24761: 1322016.. 1322047: 11339649.. 11339680: 32: 11339664:
24762: 1322048.. 1322079: 11339667.. 11339698: 32: 11339681:
24763: 1322080.. 1322111: 11339686.. 11339717: 32: 11339699:
24764: 1322112.. 1322143: 11339703.. 11339734: 32: 11339718:
24765: 1322144.. 1322175: 11339720.. 11339751: 32: 11339735:
24766: 1322176.. 1322207: 11339737.. 11339768: 32: 11339752:
24767: 1322208.. 1322239: 11339754.. 11339785: 32: 11339769:
24768: 1322240.. 1322271: 11339771.. 11339802: 32: 11339786:
24769: 1322272.. 1322303: 11339789.. 11339820: 32: 11339803:

But again this is on a fresh empty Btrfs volume.
Re: Btrfs + compression = slow performance and high cpu usage
On Fri, Jul 28, 2017 at 06:20:14PM +0000, William Muriithi wrote:
> Hi Roman,
>
> > > autodefrag
>
> > This sure sounded like a good thing to enable? on paper? right?...
> >
> > The moment you see anything remotely weird about btrfs, this is the first
> > thing you have to disable and retest without. Oh wait, the first would be
> > qgroups, this one is second.
>
> What's the problem with autodefrag? I am also using it, so you caught my
> attention when you implied that it shouldn't be used. According to the docs, it
> seems like one of the very mature features of the filesystem. See below for
> the doc I am referring to:
>
> https://btrfs.wiki.kernel.org/index.php/Status
>
> I am using it as I assumed it could prevent the filesystem becoming too
> fragmented long term, but never thought there was a price to pay for using it.

It introduces additional I/O on writes, as it modifies a small area surrounding any write or cluster of writes. I'm not aware of it causing massive slowdowns, in the way that qgroups does in some situations. If your system is already marginal in terms of being able to support the I/O required, then turning on autodefrag will make things worse (but you may be heading for _much_ worse performance in the future as the FS becomes more fragmented -- depending on your write patterns and use case).

Hugo.

-- 
Hugo Mills             | Great oxymorons of the world, no. 6:
hugo@... carfax.org.uk | Mature Student
http://carfax.org.uk/  | PGP: E2AB1DE4
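For anyone wanting to test that, autodefrag can be toggled without recreating the filesystem; a minimal sketch, assuming the filesystem is the one mounted at /mnt/storage in the earlier messages (the mount point is only illustrative):

# mount -o remount,noautodefrag /mnt/storage

Remount with 'autodefrag' again to re-enable it; only writes made while the option is active are affected.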
RE: Btrfs + compression = slow performance and high cpu usage
Hi Roman,

> > autodefrag

> This sure sounded like a good thing to enable? on paper? right?...
>
> The moment you see anything remotely weird about btrfs, this is the first
> thing you have to disable and retest without. Oh wait, the first would be
> qgroups, this one is second.

What's the problem with autodefrag? I am also using it, so you caught my attention when you implied that it shouldn't be used. According to the docs, it seems like one of the very mature features of the filesystem. See below for the doc I am referring to:

https://btrfs.wiki.kernel.org/index.php/Status

I am using it as I assumed it could prevent the filesystem becoming too fragmented long term, but never thought there was a price to pay for using it.

Regards,
William
Re: Btrfs + compression = slow performance and high cpu usage
> I am stuck with a problem of btrfs slow performance when using > compression. [ ... ] That to me looks like an issue with speed, not performance, and in particular with PEBCAK issues. As to high CPU usage, when you find a way to do both compression and checksumming without using much CPU time, please send patches urgently :-). In your case the increase in CPU time is bizarre. I have the Ubuntu 4.4 "lts-xenial" kernel and what you report does not happen here (with a few little changes): soft# grep 'model name' /proc/cpuinfo | sort -u model name : AMD FX(tm)-6100 Six-Core Processor soft# cpufreq-info | grep 'current CPU frequency' current CPU frequency is 3.30 GHz (asserted by call to hardware). current CPU frequency is 3.30 GHz (asserted by call to hardware). current CPU frequency is 3.30 GHz (asserted by call to hardware). current CPU frequency is 3.30 GHz (asserted by call to hardware). current CPU frequency is 3.30 GHz (asserted by call to hardware). current CPU frequency is 3.30 GHz (asserted by call to hardware). soft# lsscsi | grep 'sd[ae]' [0:0:0:0]diskATA HFS256G32MNB-220 3L00 /dev/sda [5:0:0:0]diskATA ST2000DM001-1CH1 CC44 /dev/sde soft# mkfs.btrfs -f /dev/sde3 [ ... ] soft# mount -t btrfs -o discard,autodefrag,compress=lzo,compress-force,commit=10 /dev/sde3 /mnt/sde3 soft# df /dev/sda6 /mnt/sde3 Filesystem 1M-blocks Used Available Use% Mounted on /dev/sda6 90048 76046 14003 85% / /dev/sde3 23756819235501 1% /mnt/sde3 The above is useful context information that was "amazingly" omitted from your reported. In dmesg I see (not the "force zlib compression"): [327730.917285] BTRFS info (device sde3): turning on discard [327730.917294] BTRFS info (device sde3): enabling auto defrag [327730.917300] BTRFS info (device sde3): setting 8 feature flag [327730.917304] BTRFS info (device sde3): force zlib compression [327730.917313] BTRFS info (device sde3): disk space caching is enabled [327730.917315] BTRFS: has skinny extents [327730.917317] BTRFS: flagging fs with big metadata feature [327730.920740] BTRFS: creating UUID tree and the result is: soft# pv -tpreb /dev/sda6 | time dd iflag=fullblock of=/mnt/sde3/testfile bs=1M count=1 oflag=direct 1+0 records in17MB/s] [==>] 11% ETA 0:15:06 1+0 records out 1048576 bytes (10 GB) copied, 112.845 s, 92.9 MB/s 0.05user 9.93system 1:53.20elapsed 8%CPU (0avgtext+0avgdata 3016maxresident)k 120inputs+20496000outputs (1major+346minor)pagefaults 0swaps 9.77GB 0:01:53 [88.3MB/s] [==>] 11% soft# btrfs fi df /mnt/sde3/ Data, single: total=10.01GiB, used=9.77GiB System, DUP: total=8.00MiB, used=16.00KiB Metadata, DUP: total=1.00GiB, used=11.66MiB GlobalReserve, single: total=16.00MiB, used=0.00B As it was running system CPU time was under 20% of one CPU: top - 18:57:29 up 3 days, 19:27, 4 users, load average: 5.44, 2.82, 1.45 Tasks: 325 total, 1 running, 324 sleeping, 0 stopped, 0 zombie %Cpu0 : 0.0 us, 2.3 sy, 0.0 ni, 91.3 id, 6.3 wa, 0.0 hi, 0.0 si, 0.0 st %Cpu1 : 0.0 us, 1.3 sy, 0.0 ni, 78.5 id, 20.2 wa, 0.0 hi, 0.0 si, 0.0 st %Cpu2 : 0.3 us, 5.8 sy, 0.0 ni, 81.0 id, 12.5 wa, 0.0 hi, 0.3 si, 0.0 st %Cpu3 : 0.3 us, 3.4 sy, 0.0 ni, 91.9 id, 4.4 wa, 0.0 hi, 0.0 si, 0.0 st %Cpu4 : 0.3 us, 10.6 sy, 0.0 ni, 55.4 id, 33.7 wa, 0.0 hi, 0.0 si, 0.0 st %Cpu5 : 0.0 us, 0.3 sy, 0.0 ni, 99.7 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st KiB Mem: 8120660 total, 5162236 used, 2958424 free, 4440100 buffers KiB Swap:0 total,0 used,0 free. 
351848 cached Mem PID PPID USER PR NIVIRTRESDATA %CPU %MEM TIME+ TTY COMMAND 21047 21046 root 20 08872 26161364 12.9 0.0 0:02.31 pts/3dd iflag=fullblo+ 21045 3535 root 20 07928 1948 460 12.3 0.0 0:00.72 pts/3pv -tpreb /dev/s+ 21019 2 root 20 0 0 0 0 1.3 0.0 0:42.88 ? [kworker/u16:1] Of course "oflag=direct" is a rather "optimistic" option in this context, so I tried again with something more sensible: soft# pv -tpreb /dev/sda6 | time dd iflag=fullblock of=/mnt/sde3/testfile bs=1M count=1 conv=fsync 1+0 records in.4MB/s] [==>] 11% ETA 0:14:41 1+0 records out 1048576 bytes (10 GB) copied, 110.523 s, 94.9 MB/s 0.03user 8.94system 1:50.71elapsed 8%CPU (0avgtext+0avgdata 3024maxresident)k 136inputs+20499648outputs (1major+348minor)pagefaults 0swaps 9.77GB 0:01:50 [90.3MB/s] [==>] 11% soft# btrfs fi df /mnt/sde3/ Data, single: total=7.01GiB, used=6.35GiB System, DUP: total=8.00MiB, used=16.00KiB Metadata, DUP: total=1.00GiB, used=15.81MiB GlobalReserve,
Re: Btrfs + compression = slow performance and high cpu usage
On Fri, 28 Jul 2017 17:40:50 +0100 (BST) "Konstantin V. Gavrilenko" wrote:

> Hello list,
>
> I am stuck with a problem of btrfs slow performance when using compression.
>
> when the compress-force=lzo mount flag is enabled, the performance drops to
> 30-40 mb/s and one of the btrfs processes utilises 100% cpu time.
> mount options: btrfs
> relatime,discard,autodefrag,compress=lzo,compress-force,space_cache=v2,commit=10

It does not work like that: you need to set compress-force=lzo (and remove compress=). With your setup I believe you currently use compress-force[=zlib] (the default), overriding compress=lzo, since it comes later in the options order.

Secondly,

> autodefrag

This sure sounded like a good thing to enable? on paper? right?... The moment you see anything remotely weird about btrfs, this is the first thing you have to disable and retest without. Oh wait, the first would be qgroups, this one is second.

Finally, what is the reasoning behind "commit=10", and did you check with the default value of 30?

-- 
With respect,
Roman
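A sketch of what the corrected options would look like -- this is only an illustration of Roman's point above, and whether autodefrag and commit=10 are kept is a separate decision:

relatime,discard,space_cache=v2,compress-force=lzo

i.e. a single compress-force=lzo instead of the compress=lzo / bare compress-force pair, so the algorithm actually being forced is the intended one. The compression setting can also be changed on a live filesystem with a remount (new writes then use the new algorithm), for example:

# mount -o remount,compress-force=lzo /mnt/arh-backup1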