Re: [PATCH v5 2/5] lib: Add zstd modules
On Fri, Aug 11, 2017 at 09:20:10AM -0400, Chris Mason wrote:
> On 08/10/2017 03:25 PM, Hugo Mills wrote:
> > On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:
> >> On 08/10/2017 04:30 AM, Eric Biggers wrote:
> >>>
> >>> These benchmarks are misleading because they compress the whole file as a
> >>> single stream without resetting the dictionary, which isn't how data will
> >>> typically be compressed in kernel mode. With filesystem compression the
> >>> data has to be divided into small chunks that can each be decompressed
> >>> independently. That eliminates one of the primary advantages of Zstandard
> >>> (support for large dictionary sizes).
> >>
> >> I did btrfs benchmarks of kernel trees and other normal data sets as
> >> well. The numbers were in line with what Nick is posting here.
> >> zstd is a big win over both lzo and zlib from a btrfs point of view.
> >>
> >> It's true Nick's patches only support a single compression level in
> >> btrfs, but that's because btrfs doesn't have a way to pass in the
> >> compression ratio. It could easily be a mount option, it was just
> >> outside the scope of Nick's initial work.
> >
> > Could we please not add more mount options? I get that they're easy
> > to implement, but it's a very blunt instrument. What we tend to see
> > (with both nodatacow and compress) is people using the mount options,
> > then asking for exceptions, discovering that they can't do that, and
> > then falling back to doing it with attributes or btrfs properties.
> > Could we just start with btrfs properties this time round, and cut out
> > the mount option part of this cycle.
> >
> > In the long run, it'd be great to see most of the btrfs-specific
> > mount options get deprecated and ultimately removed entirely, in
> > favour of attributes/properties, where feasible.
>
> It's a good point, and as was commented later down I'd just do
> mount -o compress=zstd:3 or something.
We've solved that already, here's a proposed patch that extends current mount options, https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg66248.html and a patch that could be backported to older kernels so the new mount options do not break mounts on older kernels with the new syntax https://patchwork.kernel.org/patch/9845697/ -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
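For reference, the extended value in the proposal linked above is just `algorithm[:level]` (e.g. `compress=zstd:3`). A minimal illustration of that split; the function name and the accepted-algorithm list here are illustrative, not the actual parser in fs/btrfs/super.c:

```python
def parse_compress_opt(value):
    """Parse a btrfs-style 'algo[:level]' compression value.

    Returns (algo, level), where level is None when no ':level'
    suffix is given. Hypothetical sketch of the proposed syntax.
    """
    algo, sep, level = value.partition(":")
    if algo not in ("zlib", "lzo", "zstd", "no"):
        raise ValueError("unknown compression algorithm: %s" % algo)
    return (algo, int(level) if sep else None)

print(parse_compress_opt("zstd:3"))  # → ('zstd', 3)
print(parse_compress_opt("lzo"))     # → ('lzo', None)
```

The nice property of this form, as noted in the backport patch, is that old kernels can be taught to simply ignore the `:level` suffix rather than fail the mount.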
Re: [PATCH v5 2/5] lib: Add zstd modules
On 08/10/2017 03:25 PM, Hugo Mills wrote:
> On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:
>> On 08/10/2017 04:30 AM, Eric Biggers wrote:
>>>
>>> These benchmarks are misleading because they compress the whole file as a
>>> single stream without resetting the dictionary, which isn't how data will
>>> typically be compressed in kernel mode. With filesystem compression the
>>> data has to be divided into small chunks that can each be decompressed
>>> independently. That eliminates one of the primary advantages of Zstandard
>>> (support for large dictionary sizes).
>>
>> I did btrfs benchmarks of kernel trees and other normal data sets as
>> well. The numbers were in line with what Nick is posting here.
>> zstd is a big win over both lzo and zlib from a btrfs point of view.
>>
>> It's true Nick's patches only support a single compression level in
>> btrfs, but that's because btrfs doesn't have a way to pass in the
>> compression ratio. It could easily be a mount option, it was just
>> outside the scope of Nick's initial work.
>
> Could we please not add more mount options? I get that they're easy
> to implement, but it's a very blunt instrument. What we tend to see
> (with both nodatacow and compress) is people using the mount options,
> then asking for exceptions, discovering that they can't do that, and
> then falling back to doing it with attributes or btrfs properties.
> Could we just start with btrfs properties this time round, and cut out
> the mount option part of this cycle.
>
> In the long run, it'd be great to see most of the btrfs-specific
> mount options get deprecated and ultimately removed entirely, in
> favour of attributes/properties, where feasible.

It's a good point, and as was commented later down I'd just do mount -o compress=zstd:3 or something. But I do prefer properties in general for this. My big point was just that the next step is outside of Nick's scope.

-chris
Re: [PATCH v5 2/5] lib: Add zstd modules
On 2017-08-10 15:25, Hugo Mills wrote:
> On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:
>> On 08/10/2017 04:30 AM, Eric Biggers wrote:
>>>
>>> These benchmarks are misleading because they compress the whole file as a
>>> single stream without resetting the dictionary, which isn't how data will
>>> typically be compressed in kernel mode. With filesystem compression the
>>> data has to be divided into small chunks that can each be decompressed
>>> independently. That eliminates one of the primary advantages of Zstandard
>>> (support for large dictionary sizes).
>>
>> I did btrfs benchmarks of kernel trees and other normal data sets as
>> well. The numbers were in line with what Nick is posting here.
>> zstd is a big win over both lzo and zlib from a btrfs point of view.
>>
>> It's true Nick's patches only support a single compression level in
>> btrfs, but that's because btrfs doesn't have a way to pass in the
>> compression ratio. It could easily be a mount option, it was just
>> outside the scope of Nick's initial work.
>
> Could we please not add more mount options? I get that they're easy
> to implement, but it's a very blunt instrument. What we tend to see
> (with both nodatacow and compress) is people using the mount options,
> then asking for exceptions, discovering that they can't do that, and
> then falling back to doing it with attributes or btrfs properties.
> Could we just start with btrfs properties this time round, and cut out
> the mount option part of this cycle.

AFAIUI, the intent is to extend the compression type specification for both the mount options and the property, not to add a new mount option. I think we all agree that `mount -o compress=zstd3` is a lot better than `mount -o compress=zstd,compresslevel=3`.

> In the long run, it'd be great to see most of the btrfs-specific
> mount options get deprecated and ultimately removed entirely, in
> favour of attributes/properties, where feasible.

Are properties set on the root subvolume inherited properly? Because unless they are, we can't get the same semantics.

Two other counter-arguments against completely removing BTRFS-specific mount options:

1. It's a lot easier and more clearly defined to change things that affect global behavior of the FS by a remount than by having to iterate over everything in the FS to update properties. If I'm disabling autodefrag, I'd much rather just `mount -o remount,noautodefrag` than `find / -xdev -exec btrfs property set \{\} autodefrag false`, as the first takes effect for everything simultaneously and runs far quicker.

2. There are some things that don't make sense as per-object settings or are otherwise nonsensical on objects. Many, but not all, of the BTRFS-specific mount options fall into this category IMO, with the notable exceptions of compress[-force], [no]autodefrag, [no]datacow, and [no]datasum. Some other options do make sense as properties of the filesystem (commit, flushoncommit, {inode,space}_cache, max_inline, metadata_ratio, [no]ssd, and [no]treelog are such options), but many are one-off options that affect behavior on mount (like skip_balance, clear_cache, nologreplay, norecovery, usebackuproot, and subvol).
Re: [PATCH v5 2/5] lib: Add zstd modules
On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:
> On 08/10/2017 04:30 AM, Eric Biggers wrote:
> >
> > These benchmarks are misleading because they compress the whole file as a
> > single stream without resetting the dictionary, which isn't how data will
> > typically be compressed in kernel mode. With filesystem compression the
> > data has to be divided into small chunks that can each be decompressed
> > independently. That eliminates one of the primary advantages of Zstandard
> > (support for large dictionary sizes).
>
> I did btrfs benchmarks of kernel trees and other normal data sets as
> well. The numbers were in line with what Nick is posting here.
> zstd is a big win over both lzo and zlib from a btrfs point of view.
>
> It's true Nick's patches only support a single compression level in
> btrfs, but that's because btrfs doesn't have a way to pass in the
> compression ratio. It could easily be a mount option, it was just
> outside the scope of Nick's initial work.

Could we please not add more mount options? I get that they're easy to implement, but it's a very blunt instrument. What we tend to see (with both nodatacow and compress) is people using the mount options, then asking for exceptions, discovering that they can't do that, and then falling back to doing it with attributes or btrfs properties. Could we just start with btrfs properties this time round, and cut out the mount option part of this cycle.

In the long run, it'd be great to see most of the btrfs-specific mount options get deprecated and ultimately removed entirely, in favour of attributes/properties, where feasible.

   Hugo.

-- 
Hugo Mills             | Klytus! Are your men on the right pills? Maybe you
hugo@... carfax.org.uk | should execute their trainer!
http://carfax.org.uk/  |
PGP: E2AB1DE4          | Ming the Merciless, Flash Gordon
Re: [PATCH v5 2/5] lib: Add zstd modules
On 8/10/17, 10:48 AM, "Austin S. Hemmelgarn" wrote:
> On 2017-08-10 13:24, Eric Biggers wrote:
>> On Thu, Aug 10, 2017 at 07:32:18AM -0400, Austin S. Hemmelgarn wrote:
>>> On 2017-08-10 04:30, Eric Biggers wrote:
>>>> On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>>>>>
>>>>> It can compress at speeds approaching lz4, and quality approaching lzma.
>>>>
>>>> Well, for a very loose definition of "approaching", and certainly not at
>>>> the same time. I doubt there's a use case for using the highest
>>>> compression levels in kernel mode --- especially the ones using
>>>> zstd_opt.h.
>>>
>>> Large data-sets with WORM access patterns and infrequent writes
>>> immediately come to mind as a use case for the highest compression
>>> level.
>>>
>>> As a more specific example, the company I work for has a very large
>>> amount of documentation, and we keep all old versions. This is all
>>> stored on a file server which is currently using BTRFS. Once a
>>> document is written, it's almost never rewritten, so write
>>> performance only matters for the first write. However, they're read
>>> back pretty frequently, so we need good read performance. As of
>>> right now, the system is set to use LZO compression by default, and
>>> then when a new document is added, the previous version of that
>>> document gets re-compressed using zlib compression, which actually
>>> results in pretty significant space savings most of the time. I
>>> would absolutely love to use zstd compression with this system with
>>> the highest compression level, because most people don't care how
>>> long it takes to write the file out, but they do care how long it
>>> takes to read a file (even if it's an older version).
>>
>> This may be a reasonable use case, but note this cannot just be the
>> regular "zstd" compression setting, since filesystem compression by
>> default must provide reasonable performance for many different access
>> patterns. See the patch in this series which actually adds zstd
>> compression to btrfs; it only uses level 1. I do not see a patch which
>> adds a higher compression mode. It would need to be a special setting
>> like "zstdhc" that users could opt in to on specific directories. It
>> also would need to be compared to simply compressing in userspace. In
>> many cases compressing in userspace is probably the better solution for
>> the use case in question because it works on any filesystem, allows
>> using any compression algorithm, and if random access is not needed it
>> is possible to compress each file as a single stream (like a .xz file),
>> which produces a much better compression ratio than the block-by-block
>> compression that filesystems have to use.
>
> There has been discussion as well as (I think) initial patches merged
> for support of specifying the compression level for algorithms which
> support multiple compression levels in BTRFS. I was actually under the
> impression that we had decided to use level 3 as the default for zstd,
> but that apparently isn't the case, and with the benchmark issues, it
> may not be once proper benchmarks are run.

There are some initial patches to add compression levels to BtrFS [1]. Once it's ready, we can add compression levels to zstd. The default compression level in the current patch is 3.

[1] https://lkml.kernel.org/r/20170724172939.24527-1-dste...@suse.com
Re: [PATCH v5 2/5] lib: Add zstd modules
On 8/10/17, 1:30 AM, "Eric Biggers" wrote:
> On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>>
>> It can compress at speeds approaching lz4, and quality approaching lzma.
>
> Well, for a very loose definition of "approaching", and certainly not at
> the same time. I doubt there's a use case for using the highest
> compression levels in kernel mode --- especially the ones using
> zstd_opt.h.
>
>>
>> The code was ported from the upstream zstd source repository.
>
> What version?

zstd-1.1.4 with patches applied from upstream. I'll include it in the next patch version.

>> `linux/zstd.h` header was modified to match linux kernel style. The
>> cross-platform and allocation code was stripped out. Instead zstd
>> requires the caller to pass a preallocated workspace. The source files
>> were clang-formatted [1] to match the Linux Kernel style as much as
>> possible.
>
> It would be easier to compare to the upstream version if it was not all
> reformatted. There is a chance that bugs were introduced by
> Linux-specific changes, and it would be nice if they could be easily
> reviewed. (Also I don't know what clang-format settings you used, but
> there are still a lot of differences from the Linux coding style.)

The clang-format settings I used are available in the zstd repo [1]. I left the line length long, since it looked terrible otherwise. I set up a branch in my zstd GitHub fork called "original-formatted" [2]. I've taken the source I based the kernel patches off of [3] and run clang-format without any other changes. If you have any suggestions to improve the clang-formatting, please let me know.

>>
>> I benchmarked zstd compression as a special character device. I ran zstd
>> and zlib compression at several levels, as well as performing no
>> compression, which measures the time spent copying the data to kernel
>> space. Data is passed to the compressor 4096 B at a time. The benchmark
>> file is located in the upstream zstd source repository under
>> `contrib/linux-kernel/zstd_compress_test.c` [2].
>>
>> I ran the benchmarks on an Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
>> The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
>> 16 GB of RAM, and an SSD. I benchmarked using `silesia.tar` [3], which is
>> 211,988,480 B large. Run the following commands for the benchmark:
>>
>>     sudo modprobe zstd_compress_test
>>     sudo mknod zstd_compress_test c 245 0
>>     sudo cp silesia.tar zstd_compress_test
>>
>> The time is reported by the time of the userland `cp`.
>> The MB/s is computed with
>>
>>     211,988,480 B / time(buffer size, hash)
>>
>> which includes the time to copy from userland.
>> The Adjusted MB/s is computed with
>>
>>     211,988,480 B / (time(buffer size, hash) - time(buffer size, none)).
>>
>> The memory reported is the amount of memory the compressor requests.
>>
>> | Method   | Size (B)  | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
>> |----------|-----------|----------|-------|---------|----------|----------|
>> | none     | 211988480 |    0.100 | 1     | 2119.88 |        - |        - |
>> | zstd -1  |  73645762 |    1.044 | 2.878 |  203.05 |   224.56 |     1.23 |
>> | zstd -3  |  66988878 |    1.761 | 3.165 |  120.38 |   127.63 |     2.47 |
>> | zstd -5  |  65001259 |    2.563 | 3.261 |   82.71 |    86.07 |     2.86 |
>> | zstd -10 |  60165346 |   13.242 | 3.523 |   16.01 |    16.13 |    13.22 |
>> | zstd -15 |  58009756 |   47.601 | 3.654 |    4.45 |     4.46 |    21.61 |
>> | zstd -19 |  54014593 |  102.835 | 3.925 |    2.06 |     2.06 |    60.15 |
>> | zlib -1  |  77260026 |    2.895 | 2.744 |   73.23 |    75.85 |     0.27 |
>> | zlib -3  |  72972206 |    4.116 | 2.905 |   51.50 |    52.79 |     0.27 |
>> | zlib -6  |  68190360 |    9.633 | 3.109 |   22.01 |    22.24 |     0.27 |
>> | zlib -9  |  67613382 |   22.554 | 3.135 |    9.40 |     9.44 |     0.27 |
>
> These benchmarks are misleading because they compress the whole file as a
> single stream without resetting the dictionary, which isn't how data will
> typically be compressed in kernel mode. With filesystem compression the
> data has to be divided into small chunks that can each be decompressed
> independently. That eliminates one of the primary advantages of Zstandard
> (support for large dictionary sizes).

This benchmark isn't meant to be representative of a filesystem scenario. I wanted to show off zstd without anything else going on. Even in filesystems where the data is chunked, zstd uses the whole chunk as the window (128 KB in BtrFS and SquashFS by default), where zlib uses 32 KB. I have benchmarks for BtrFS and SquashFS in their respective patches [4][5], and I've copied the BtrFS table below (which was run with 2 threads).

| Method | Ratio | Compression MB/s | Decompression speed |
|--------|-------|------------------|---------------------|
| None   | 0.99  | 504              | 686                 |
| lzo    | 1.66  |
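As a sanity check on the methodology above, the MB/s column matches the silesia.tar size of 211,988,480 B divided by the measured time, and the adjusted figure subtracts the `none` baseline (the pure copy-to-kernel time). A small sketch reproducing the `zstd -1` row's numbers; the helper names are mine:

```python
FILE_SIZE = 211_988_480  # bytes in silesia.tar, per the benchmark email

def mbps(size_b, seconds):
    """Raw throughput in MB/s, including the copy-from-userland time."""
    return size_b / seconds / 1e6

def adjusted_mbps(size_b, seconds, baseline_seconds):
    """Throughput with the no-compression copy time subtracted out."""
    return size_b / (seconds - baseline_seconds) / 1e6

t_none = 0.100   # 'none' row: time for the copy alone
t_zstd1 = 1.044  # 'zstd -1' row

print(round(mbps(FILE_SIZE, t_zstd1), 2))                   # → 203.05
print(round(adjusted_mbps(FILE_SIZE, t_zstd1, t_none), 2))  # → 224.56
```

Both values agree with the table, which is a useful cross-check that the copy overhead, not the compressor, dominates only at the fastest settings.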
Re: [PATCH v5 2/5] lib: Add zstd modules
On 08/10/2017 03:00 PM, Eric Biggers wrote:
> On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:
>> On 08/10/2017 04:30 AM, Eric Biggers wrote:
>>> On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>>>>
>>>> The memory reported is the amount of memory the compressor requests.
>>>>
>>>> | Method   | Size (B)  | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
>>>> |----------|-----------|----------|-------|---------|----------|----------|
>>>> | none     | 211988480 |    0.100 | 1     | 2119.88 |        - |        - |
>>>> | zstd -1  |  73645762 |    1.044 | 2.878 |  203.05 |   224.56 |     1.23 |
>>>> | zstd -3  |  66988878 |    1.761 | 3.165 |  120.38 |   127.63 |     2.47 |
>>>> | zstd -5  |  65001259 |    2.563 | 3.261 |   82.71 |    86.07 |     2.86 |
>>>> | zstd -10 |  60165346 |   13.242 | 3.523 |   16.01 |    16.13 |    13.22 |
>>>> | zstd -15 |  58009756 |   47.601 | 3.654 |    4.45 |     4.46 |    21.61 |
>>>> | zstd -19 |  54014593 |  102.835 | 3.925 |    2.06 |     2.06 |    60.15 |
>>>> | zlib -1  |  77260026 |    2.895 | 2.744 |   73.23 |    75.85 |     0.27 |
>>>> | zlib -3  |  72972206 |    4.116 | 2.905 |   51.50 |    52.79 |     0.27 |
>>>> | zlib -6  |  68190360 |    9.633 | 3.109 |   22.01 |    22.24 |     0.27 |
>>>> | zlib -9  |  67613382 |   22.554 | 3.135 |    9.40 |     9.44 |     0.27 |
>>>
>>> These benchmarks are misleading because they compress the whole file as
>>> a single stream without resetting the dictionary, which isn't how data
>>> will typically be compressed in kernel mode. With filesystem compression
>>> the data has to be divided into small chunks that can each be
>>> decompressed independently. That eliminates one of the primary
>>> advantages of Zstandard (support for large dictionary sizes).
>>
>> I did btrfs benchmarks of kernel trees and other normal data sets as
>> well. The numbers were in line with what Nick is posting here.
>> zstd is a big win over both lzo and zlib from a btrfs point of view.
>>
>> It's true Nick's patches only support a single compression level in
>> btrfs, but that's because btrfs doesn't have a way to pass in the
>> compression ratio. It could easily be a mount option, it was just
>> outside the scope of Nick's initial work.
>
> I am not surprised --- Zstandard is closer to the state of the art, both
> format-wise and implementation-wise, than the other choices in BTRFS. My
> point is that benchmarks need to account for how much data is compressed
> at a time. This is a common mistake when comparing different compression
> algorithms; the algorithm name and compression level do not tell the
> whole story. The dictionary size is extremely significant. No one is
> going to compress or decompress a 200 MB file as a single stream in
> kernel mode, so it does not make sense to justify adding Zstandard *to
> the kernel* based on such a benchmark. It is going to be divided into
> chunks. How big are the chunks in BTRFS? I thought that it compressed
> only one page (4 KiB) at a time, but I hope that has been, or is being,
> improved; 32 KiB - 128 KiB should be a better amount. (And if the amount
> of data compressed at a time happens to be different between the
> different algorithms, note that BTRFS benchmarks are likely to be
> measuring that as much as the algorithms themselves.)

Btrfs hooks the compression code into the delayed allocation mechanism we use to gather large extents for COW. So if you write 100MB to a file, we'll have 100MB to compress at a time (within the limits of the amount of pages we allow to collect before forcing it down). But we want to balance how much memory you might need to uncompress during random reads. So we have an artificial limit of 128KB that we send at a time to the compression code. It's easy to change this, it's just a tradeoff made to limit the cost of reading small bits. It's the same for zlib, lzo and the new zstd patch.

-chris
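The effect Eric describes is easy to demonstrate in miniature from userspace with zlib: redundancy that spans chunk boundaries is captured when compressing one stream, and lost when each chunk must be independently decompressible. A small sketch (the data is synthetic; only the ordering of the two sizes matters, not the exact values):

```python
import os
import zlib

# A 3000-byte random block repeated 100 times: each copy is
# incompressible on its own, but the repetition is trivially
# compressible once the match window spans multiple copies.
block = os.urandom(3000)
data = block * 100

# One stream: zlib's 32 KiB window sees the previous copies,
# so the whole input collapses to roughly one block's worth.
whole = len(zlib.compress(data, 6))

# Independent 4 KiB chunks, as a filesystem must do to allow
# random reads: almost all cross-chunk redundancy is lost.
CHUNK = 4096
chunked = sum(
    len(zlib.compress(data[i:i + CHUNK], 6))
    for i in range(0, len(data), CHUNK)
)

print(whole < chunked)  # → True, by a wide margin
```

With btrfs's 128 KiB chunks the penalty is smaller than with 4 KiB ones, but the principle is the same, and it is also why zstd's ability to use the full 128 KiB chunk as its window (versus zlib's 32 KiB) still matters even under chunking.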
Re: [PATCH v5 2/5] lib: Add zstd modules
On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:
> On 08/10/2017 04:30 AM, Eric Biggers wrote:
>> On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>>>
>>> The memory reported is the amount of memory the compressor requests.
>>>
>>> | Method   | Size (B)  | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
>>> |----------|-----------|----------|-------|---------|----------|----------|
>>> | none     | 211988480 |    0.100 | 1     | 2119.88 |        - |        - |
>>> | zstd -1  |  73645762 |    1.044 | 2.878 |  203.05 |   224.56 |     1.23 |
>>> | zstd -3  |  66988878 |    1.761 | 3.165 |  120.38 |   127.63 |     2.47 |
>>> | zstd -5  |  65001259 |    2.563 | 3.261 |   82.71 |    86.07 |     2.86 |
>>> | zstd -10 |  60165346 |   13.242 | 3.523 |   16.01 |    16.13 |    13.22 |
>>> | zstd -15 |  58009756 |   47.601 | 3.654 |    4.45 |     4.46 |    21.61 |
>>> | zstd -19 |  54014593 |  102.835 | 3.925 |    2.06 |     2.06 |    60.15 |
>>> | zlib -1  |  77260026 |    2.895 | 2.744 |   73.23 |    75.85 |     0.27 |
>>> | zlib -3  |  72972206 |    4.116 | 2.905 |   51.50 |    52.79 |     0.27 |
>>> | zlib -6  |  68190360 |    9.633 | 3.109 |   22.01 |    22.24 |     0.27 |
>>> | zlib -9  |  67613382 |   22.554 | 3.135 |    9.40 |     9.44 |     0.27 |
>>
>> These benchmarks are misleading because they compress the whole file as a
>> single stream without resetting the dictionary, which isn't how data will
>> typically be compressed in kernel mode. With filesystem compression the
>> data has to be divided into small chunks that can each be decompressed
>> independently. That eliminates one of the primary advantages of Zstandard
>> (support for large dictionary sizes).
>
> I did btrfs benchmarks of kernel trees and other normal data sets as
> well. The numbers were in line with what Nick is posting here.
> zstd is a big win over both lzo and zlib from a btrfs point of view.
>
> It's true Nick's patches only support a single compression level in
> btrfs, but that's because btrfs doesn't have a way to pass in the
> compression ratio. It could easily be a mount option, it was just
> outside the scope of Nick's initial work.

I am not surprised --- Zstandard is closer to the state of the art, both format-wise and implementation-wise, than the other choices in BTRFS. My point is that benchmarks need to account for how much data is compressed at a time. This is a common mistake when comparing different compression algorithms; the algorithm name and compression level do not tell the whole story. The dictionary size is extremely significant. No one is going to compress or decompress a 200 MB file as a single stream in kernel mode, so it does not make sense to justify adding Zstandard *to the kernel* based on such a benchmark. It is going to be divided into chunks. How big are the chunks in BTRFS? I thought that it compressed only one page (4 KiB) at a time, but I hope that has been, or is being, improved; 32 KiB - 128 KiB should be a better amount. (And if the amount of data compressed at a time happens to be different between the different algorithms, note that BTRFS benchmarks are likely to be measuring that as much as the algorithms themselves.)

Eric
Re: [PATCH v5 2/5] lib: Add zstd modules
On 2017-08-10 13:24, Eric Biggers wrote: On Thu, Aug 10, 2017 at 07:32:18AM -0400, Austin S. Hemmelgarn wrote: On 2017-08-10 04:30, Eric Biggers wrote: On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote: It can compress at speeds approaching lz4, and quality approaching lzma. Well, for a very loose definition of "approaching", and certainly not at the same time. I doubt there's a use case for using the highest compression levels in kernel mode --- especially the ones using zstd_opt.h. Large data-sets with WORM access patterns and infrequent writes immediately come to mind as a use case for the highest compression level. As a more specific example, the company I work for has a very large amount of documentation, and we keep all old versions. This is all stored on a file server which is currently using BTRFS. Once a document is written, it's almost never rewritten, so write performance only matters for the first write. However, they're read back pretty frequently, so we need good read performance. As of right now, the system is set to use LZO compression by default, and then when a new document is added, the previous version of that document gets re-compressed using zlib compression, which actually results in pretty significant space savings most of the time. I would absolutely love to use zstd compression with this system with the highest compression level, because most people don't care how long it takes to write the file out, but they do care how long it takes to read a file (even if it's an older version). This may be a reasonable use case, but note this cannot just be the regular "zstd" compression setting, since filesystem compression by default must provide reasonable performance for many different access patterns. See the patch in this series which actually adds zstd compression to btrfs; it only uses level 1. I do not see a patch which adds a higher compression mode. 
It would need to be a special setting like "zstdhc" that users could opt-in to on specific directories. It also would need to be compared to simply compressing in userspace. In many cases compressing in userspace is probably the better solution for the use case in question because it works on any filesystem, allows using any compression algorithm, and if random access is not needed it is possible to compress each file as a single stream (like a .xz file), which produces a much better compression ratio than the block-by-block compression that filesystems have to use. There has been discussion as well as (I think) initial patches merged for support of specifying the compression level for algorithms which support multiple compression levels in BTRFS. I was actually under the impression that we had decided to use level 3 as the default for zstd, but that apparently isn't the case, and with the benchmark issues, it may not be once proper benchmarks are run. Also, on the note of compressing in userspace, the use case I quoted at least can't do that because we have to deal with Windows clients and users have to be able to open files directly on said Windows clients. I entirely agree that real archival storage is better off using userspace compression, but sometimes real archival storage isn't an option. Note also that LZ4HC is in the kernel source tree currently but no one is using it vs. the regular LZ4. I think it is the kind of thing that sounded useful originally, but at the end of the day no one really wants to use it in kernel mode. I'd certainly be interested in actual patches, though. Part of that is the fact that BTRFS is one of the only consumers (AFAIK) of this API that can freely choose all aspects of their usage, and the consensus here (which I don't agree with I might add) amounts to the argument that 'we already have compression with a compression ratio, we don't need more things like that'. 
I would personally love to see LZ4HC support in BTRFS (based on testing my own use cases, LZ4 is more deterministic than LZO for both compression and decompression, and most of the non-archival usage I have of BTRFS benefits from determinism), but there's not any point in me writing up such a patch because it's almost certain to get rejected because BTRFS already has LZO. The main reason that zstd is getting considered at all is that the quoted benchmarks show clear benefits in decompression speed relative to zlib and far better compression ratios than LZO.
Re: [PATCH v5 2/5] lib: Add zstd modules
On 08/10/2017 04:30 AM, Eric Biggers wrote:
> On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>>
>> The memory reported is the amount of memory the compressor requests.
>>
>> | Method   | Size (B)  | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
>> |----------|-----------|----------|-------|---------|----------|----------|
>> | none     | 211988480 |    0.100 | 1     | 2119.88 |        - |        - |
>> | zstd -1  |  73645762 |    1.044 | 2.878 |  203.05 |   224.56 |     1.23 |
>> | zstd -3  |  66988878 |    1.761 | 3.165 |  120.38 |   127.63 |     2.47 |
>> | zstd -5  |  65001259 |    2.563 | 3.261 |   82.71 |    86.07 |     2.86 |
>> | zstd -10 |  60165346 |   13.242 | 3.523 |   16.01 |    16.13 |    13.22 |
>> | zstd -15 |  58009756 |   47.601 | 3.654 |    4.45 |     4.46 |    21.61 |
>> | zstd -19 |  54014593 |  102.835 | 3.925 |    2.06 |     2.06 |    60.15 |
>> | zlib -1  |  77260026 |    2.895 | 2.744 |   73.23 |    75.85 |     0.27 |
>> | zlib -3  |  72972206 |    4.116 | 2.905 |   51.50 |    52.79 |     0.27 |
>> | zlib -6  |  68190360 |    9.633 | 3.109 |   22.01 |    22.24 |     0.27 |
>> | zlib -9  |  67613382 |   22.554 | 3.135 |    9.40 |     9.44 |     0.27 |
>
> These benchmarks are misleading because they compress the whole file as a
> single stream without resetting the dictionary, which isn't how data will
> typically be compressed in kernel mode. With filesystem compression the
> data has to be divided into small chunks that can each be decompressed
> independently. That eliminates one of the primary advantages of Zstandard
> (support for large dictionary sizes).

I did btrfs benchmarks of kernel trees and other normal data sets as well. The numbers were in line with what Nick is posting here. zstd is a big win over both lzo and zlib from a btrfs point of view.

It's true Nick's patches only support a single compression level in btrfs, but that's because btrfs doesn't have a way to pass in the compression ratio. It could easily be a mount option, it was just outside the scope of Nick's initial work.

-chris
Re: [PATCH v5 2/5] lib: Add zstd modules
On Thu, Aug 10, 2017 at 10:57:01AM -0400, Austin S. Hemmelgarn wrote:
> Also didn't think to mention this, but I could see the max level being
> very popular for use with SquashFS root filesystems used in LiveCDs.
> Currently, they have to decide between read performance and image size,
> while zstd would provide both.

The high compression levels of Zstandard are indeed a great fit for SquashFS, but SquashFS images are created in userspace by squashfs-tools. The kernel only needs to be able to decompress them.

(Also, while Zstandard provides very good tradeoffs and will probably become the preferred algorithm for SquashFS, it's misleading to imply that users won't have to make decisions anymore. It does not compress as well as XZ or decompress as fast as LZ4, except maybe in very carefully crafted benchmarks.)

Eric
Re: [PATCH v5 2/5] lib: Add zstd modules
On Thu, Aug 10, 2017 at 07:32:18AM -0400, Austin S. Hemmelgarn wrote:
> On 2017-08-10 04:30, Eric Biggers wrote:
> > On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
> > >
> > > It can compress at speeds approaching lz4, and quality approaching
> > > lzma.
> >
> > Well, for a very loose definition of "approaching", and certainly not
> > at the same time. I doubt there's a use case for using the highest
> > compression levels in kernel mode --- especially the ones using
> > zstd_opt.h.
>
> Large data-sets with WORM access patterns and infrequent writes
> immediately come to mind as a use case for the highest compression level.
>
> As a more specific example, the company I work for has a very large
> amount of documentation, and we keep all old versions. This is all
> stored on a file server which is currently using BTRFS. Once a document
> is written, it's almost never rewritten, so write performance only
> matters for the first write. However, they're read back pretty
> frequently, so we need good read performance. As of right now, the
> system is set to use LZO compression by default, and then when a new
> document is added, the previous version of that document gets
> re-compressed using zlib compression, which actually results in pretty
> significant space savings most of the time. I would absolutely love to
> use zstd compression with this system with the highest compression
> level, because most people don't care how long it takes to write the
> file out, but they do care how long it takes to read a file (even if
> it's an older version).

This may be a reasonable use case, but note this cannot just be the
regular "zstd" compression setting, since filesystem compression by
default must provide reasonable performance for many different access
patterns. See the patch in this series which actually adds zstd
compression to btrfs; it only uses level 1. I do not see a patch which
adds a higher compression mode.
It would need to be a special setting like "zstdhc" that users could
opt-in to on specific directories. It also would need to be compared to
simply compressing in userspace. In many cases compressing in userspace
is probably the better solution for the use case in question because it
works on any filesystem, allows using any compression algorithm, and if
random access is not needed it is possible to compress each file as a
single stream (like a .xz file), which produces a much better compression
ratio than the block-by-block compression that filesystems have to use.

Note also that LZ4HC is in the kernel source tree currently but no one is
using it vs. the regular LZ4. I think it is the kind of thing that
sounded useful originally, but at the end of the day no one really wants
to use it in kernel mode. I'd certainly be interested in actual patches,
though.

Eric
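The ratio gap Eric describes, one whole-file stream versus independently decompressible blocks, can be demonstrated with the standard-library zlib as a stand-in codec. The data and the 4 KiB block size are arbitrary choices for illustration.

```python
import zlib

# Hedged sketch: compress the same data (a) as one stream, the way a
# userspace .xz/.zst file would, and (b) block by block with the
# compressor reset per 4 KiB block, the way filesystem compression must
# so each block stays independently decompressible.
data = b"the quick brown fox jumps over the lazy dog\n" * 8192

one_stream = len(zlib.compress(data, 6))
per_block = sum(len(zlib.compress(data[i:i + 4096], 6))
                for i in range(0, len(data), 4096))

print(one_stream, per_block)
assert one_stream < per_block  # the single stream wins on ratio
```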
Re: [PATCH v5 2/5] lib: Add zstd modules
On 2017-08-10 07:32, Austin S. Hemmelgarn wrote:
> On 2017-08-10 04:30, Eric Biggers wrote:
> > On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
> > > It can compress at speeds approaching lz4, and quality approaching
> > > lzma.
> >
> > Well, for a very loose definition of "approaching", and certainly not
> > at the same time. I doubt there's a use case for using the highest
> > compression levels in kernel mode --- especially the ones using
> > zstd_opt.h.
>
> Large data-sets with WORM access patterns and infrequent writes
> immediately come to mind as a use case for the highest compression level.
>
> As a more specific example, the company I work for has a very large
> amount of documentation, and we keep all old versions. This is all
> stored on a file server which is currently using BTRFS. Once a document
> is written, it's almost never rewritten, so write performance only
> matters for the first write. However, they're read back pretty
> frequently, so we need good read performance. As of right now, the
> system is set to use LZO compression by default, and then when a new
> document is added, the previous version of that document gets
> re-compressed using zlib compression, which actually results in pretty
> significant space savings most of the time. I would absolutely love to
> use zstd compression with this system with the highest compression
> level, because most people don't care how long it takes to write the
> file out, but they do care how long it takes to read a file (even if
> it's an older version).

Also didn't think to mention this, but I could see the max level being
very popular for use with SquashFS root filesystems used in LiveCDs.
Currently, they have to decide between read performance and image size,
while zstd would provide both.
Re: [PATCH v5 2/5] lib: Add zstd modules
On 2017-08-10 04:30, Eric Biggers wrote:
> On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
> >
> > It can compress at speeds approaching lz4, and quality approaching
> > lzma.
>
> Well, for a very loose definition of "approaching", and certainly not at
> the same time. I doubt there's a use case for using the highest
> compression levels in kernel mode --- especially the ones using
> zstd_opt.h.

Large data-sets with WORM access patterns and infrequent writes
immediately come to mind as a use case for the highest compression level.

As a more specific example, the company I work for has a very large
amount of documentation, and we keep all old versions. This is all
stored on a file server which is currently using BTRFS. Once a document
is written, it's almost never rewritten, so write performance only
matters for the first write. However, they're read back pretty
frequently, so we need good read performance. As of right now, the
system is set to use LZO compression by default, and then when a new
document is added, the previous version of that document gets
re-compressed using zlib compression, which actually results in pretty
significant space savings most of the time. I would absolutely love to
use zstd compression with this system with the highest compression
level, because most people don't care how long it takes to write the
file out, but they do care how long it takes to read a file (even if
it's an older version).
Re: [PATCH v5 2/5] lib: Add zstd modules
On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>
> It can compress at speeds approaching lz4, and quality approaching lzma.

Well, for a very loose definition of "approaching", and certainly not at
the same time. I doubt there's a use case for using the highest
compression levels in kernel mode --- especially the ones using
zstd_opt.h.

> The code was ported from the upstream zstd source repository.

What version?

> `linux/zstd.h` header was modified to match linux kernel style.
> The cross-platform and allocation code was stripped out. Instead zstd
> requires the caller to pass a preallocated workspace. The source files
> were clang-formatted [1] to match the Linux Kernel style as much as
> possible.

It would be easier to compare to the upstream version if it was not all
reformatted. There is a chance that bugs were introduced by
Linux-specific changes, and it would be nice if they could be easily
reviewed. (Also I don't know what clang-format settings you used, but
there are still a lot of differences from the Linux coding style.)

> I benchmarked zstd compression as a special character device. I ran zstd
> and zlib compression at several levels, as well as performing no
> compression, which measures the time spent copying the data to kernel
> space. Data is passed to the compressor 4096 B at a time. The benchmark
> file is located in the upstream zstd source repository under
> `contrib/linux-kernel/zstd_compress_test.c` [2].
>
> I ran the benchmarks on an Ubuntu 14.04 VM with 2 cores and 4 GiB of
> RAM. The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7
> processor, 16 GB of RAM, and an SSD. I benchmarked using `silesia.tar`
> [3], which is 211,988,480 B large. Run the following commands for the
> benchmark:
>
>     sudo modprobe zstd_compress_test
>     sudo mknod zstd_compress_test c 245 0
>     sudo cp silesia.tar zstd_compress_test
>
> The time is reported by the time of the userland `cp`.
> The MB/s is computed with
>
>     211,988,480 B / time(buffer size, level)
>
> which includes the time to copy from userland. The Adjusted MB/s is
> computed with
>
>     211,988,480 B / (time(buffer size, level) - time(buffer size, none))
>
> The memory reported is the amount of memory the compressor requests.
>
> | Method   | Size (B)  | Time (s) | Ratio | MB/s    | Adj MB/s | Mem (MB) |
> |----------|-----------|----------|-------|---------|----------|----------|
> | none     | 211988480 |    0.100 | 1     | 2119.88 |        - |        - |
> | zstd -1  |  73645762 |    1.044 | 2.878 |  203.05 |   224.56 |     1.23 |
> | zstd -3  |  66988878 |    1.761 | 3.165 |  120.38 |   127.63 |     2.47 |
> | zstd -5  |  65001259 |    2.563 | 3.261 |   82.71 |    86.07 |     2.86 |
> | zstd -10 |  60165346 |   13.242 | 3.523 |   16.01 |    16.13 |    13.22 |
> | zstd -15 |  58009756 |   47.601 | 3.654 |    4.45 |     4.46 |    21.61 |
> | zstd -19 |  54014593 |  102.835 | 3.925 |    2.06 |     2.06 |    60.15 |
> | zlib -1  |  77260026 |    2.895 | 2.744 |   73.23 |    75.85 |     0.27 |
> | zlib -3  |  72972206 |    4.116 | 2.905 |   51.50 |    52.79 |     0.27 |
> | zlib -6  |  68190360 |    9.633 | 3.109 |   22.01 |    22.24 |     0.27 |
> | zlib -9  |  67613382 |   22.554 | 3.135 |    9.40 |     9.44 |     0.27 |

These benchmarks are misleading because they compress the whole file as a
single stream without resetting the dictionary, which isn't how data will
typically be compressed in kernel mode. With filesystem compression the
data has to be divided into small chunks that can each be decompressed
independently. That eliminates one of the primary advantages of Zstandard
(support for large dictionary sizes).

Eric
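As a quick sanity check, the derived columns of the benchmark table can be reproduced from the 211,988,480 B file size and the 0.100 s no-compression baseline. The arithmetic below redoes the zstd -1 row:

```python
# Cross-check of the zstd -1 row against the reported formulas
# (file size 211,988,480 B; the 'none' copy took 0.100 s).
orig_size = 211_988_480
t_none, t_zstd1, size_zstd1 = 0.100, 1.044, 73_645_762

ratio = orig_size / size_zstd1                        # table: 2.878
mb_per_s = orig_size / t_zstd1 / 1e6                  # table: 203.05
adj_mb_per_s = orig_size / (t_zstd1 - t_none) / 1e6   # table: 224.56

print(round(ratio, 3), round(mb_per_s, 2), round(adj_mb_per_s, 2))
```

All three values match the table, which also confirms that the formulas' numerator is the 211,988,480 B input size rather than the compressed size.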