Re: [PATCH v5 2/5] lib: Add zstd modules

2017-08-14 Thread David Sterba
On Fri, Aug 11, 2017 at 09:20:10AM -0400, Chris Mason wrote:
> 
> 
> On 08/10/2017 03:25 PM, Hugo Mills wrote:
> > On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:
> >> On 08/10/2017 04:30 AM, Eric Biggers wrote:
> >>>
> >>> These benchmarks are misleading because they compress the whole file as a
> >>> single stream without resetting the dictionary, which isn't how data will
> >>> typically be compressed in kernel mode.  With filesystem compression the data
> >>> has to be divided into small chunks that can each be decompressed independently.
> >>> That eliminates one of the primary advantages of Zstandard (support for large
> >>> dictionary sizes).
> >>
> >> I did btrfs benchmarks of kernel trees and other normal data sets as
> >> well.  The numbers were in line with what Nick is posting here.
> >> zstd is a big win over both lzo and zlib from a btrfs point of view.
> >>
> >> It's true Nick's patches only support a single compression level in
> >> btrfs, but that's because btrfs doesn't have a way to pass in the
> >> compression ratio.  It could easily be a mount option, it was just
> >> outside the scope of Nick's initial work.
> > 
> > Could we please not add more mount options? I get that they're easy
> > to implement, but it's a very blunt instrument. What we tend to see
> > (with both nodatacow and compress) is people using the mount options,
> > then asking for exceptions, discovering that they can't do that, and
> > then falling back to doing it with attributes or btrfs properties.
> > Could we just start with btrfs properties this time round, and cut out
> > the mount option part of this cycle?
> > 
> > In the long run, it'd be great to see most of the btrfs-specific
> > mount options get deprecated and ultimately removed entirely, in
> > favour of attributes/properties, where feasible.
> > 
> 
> It's a good point, and as was commented later down I'd just do mount -o 
> compress=zstd:3 or something.

We've solved that already, here's a proposed patch that extends current
mount options,

https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg66248.html

and a patch that could be backported to older kernels so the new mount
options do not break mounts on older kernels with the new syntax

https://patchwork.kernel.org/patch/9845697/



Re: [PATCH v5 2/5] lib: Add zstd modules

2017-08-11 Thread Chris Mason



On 08/10/2017 03:25 PM, Hugo Mills wrote:

On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:

On 08/10/2017 04:30 AM, Eric Biggers wrote:


These benchmarks are misleading because they compress the whole file as a
single stream without resetting the dictionary, which isn't how data will
typically be compressed in kernel mode.  With filesystem compression the data
has to be divided into small chunks that can each be decompressed independently.
That eliminates one of the primary advantages of Zstandard (support for large
dictionary sizes).


I did btrfs benchmarks of kernel trees and other normal data sets as
well.  The numbers were in line with what Nick is posting here.
zstd is a big win over both lzo and zlib from a btrfs point of view.

It's true Nick's patches only support a single compression level in
btrfs, but that's because btrfs doesn't have a way to pass in the
compression ratio.  It could easily be a mount option, it was just
outside the scope of Nick's initial work.


Could we please not add more mount options? I get that they're easy
to implement, but it's a very blunt instrument. What we tend to see
(with both nodatacow and compress) is people using the mount options,
then asking for exceptions, discovering that they can't do that, and
then falling back to doing it with attributes or btrfs properties.
Could we just start with btrfs properties this time round, and cut out
the mount option part of this cycle?

In the long run, it'd be great to see most of the btrfs-specific
mount options get deprecated and ultimately removed entirely, in
favour of attributes/properties, where feasible.



It's a good point, and as was commented later down I'd just do mount -o 
compress=zstd:3 or something.


But I do prefer properties in general for this.  My big point was just 
that the next step is outside of Nick's scope.


-chris



Re: [PATCH v5 2/5] lib: Add zstd modules

2017-08-10 Thread Austin S. Hemmelgarn

On 2017-08-10 15:25, Hugo Mills wrote:

On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:

On 08/10/2017 04:30 AM, Eric Biggers wrote:


These benchmarks are misleading because they compress the whole file as a
single stream without resetting the dictionary, which isn't how data will
typically be compressed in kernel mode.  With filesystem compression the data
has to be divided into small chunks that can each be decompressed independently.
That eliminates one of the primary advantages of Zstandard (support for large
dictionary sizes).


I did btrfs benchmarks of kernel trees and other normal data sets as
well.  The numbers were in line with what Nick is posting here.
zstd is a big win over both lzo and zlib from a btrfs point of view.

It's true Nick's patches only support a single compression level in
btrfs, but that's because btrfs doesn't have a way to pass in the
compression ratio.  It could easily be a mount option, it was just
outside the scope of Nick's initial work.


Could we please not add more mount options? I get that they're easy
to implement, but it's a very blunt instrument. What we tend to see
(with both nodatacow and compress) is people using the mount options,
then asking for exceptions, discovering that they can't do that, and
then falling back to doing it with attributes or btrfs properties.
Could we just start with btrfs properties this time round, and cut out
the mount option part of this cycle?
AFAIUI, the intent is to extend the compression type specification for 
both the mount options and the property, not to add a new mount option. 
I think we all agree that `mount -o compress=zstd3` is a lot better than 
`mount -o compress=zstd,compresslevel=3`.
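
As a rough illustration of the combined syntax under discussion, here is a
hedged sketch of splitting a compress=<algo>:<level> value on the kernel side.
The helper name is hypothetical and this is not the parser from the linked
patch; strsep() and kstrtoint() are the standard kernel string helpers assumed
here:

#include <linux/kernel.h>
#include <linux/string.h>

/*
 * Hypothetical sketch: split a "zstd:3"-style value into an algorithm name
 * and an optional level.  A level of 0 means "use the algorithm's default".
 */
static int parse_compress_value(char *value, const char **algo, int *level)
{
	char *lvl = value;

	*algo = strsep(&lvl, ":");	/* "zstd:3" -> "zstd" plus "3" */
	*level = 0;
	if (lvl && *lvl)
		return kstrtoint(lvl, 10, level);
	return 0;
}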


In the long run, it'd be great to see most of the btrfs-specific
mount options get deprecated and ultimately removed entirely, in
favour of attributes/properties, where feasible.
Are properties set on the root subvolume inherited properly?  Because 
unless they are, we can't get the same semantics.


Two other counter arguments on completely removing BTRFS-specific mount 
options:
1. It's a lot easier and a lot more clearly defined to change things 
that affect global behavior of the FS by a remount than having to 
iterate everything in the FS to update properties.  If I'm disabling 
autodefrag, I'd much rather just `mount -o remount,noautodefrag` than 
`find / -xdev -exec btrfs property set \{\} autodefrag false`, as the 
first will take effect for everything simultaneously and run 
exponentially quicker.
2. There are some things that don't make sense as per-object settings or 
are otherwise nonsensical on objects.  Many, but not all, of the BTRFS 
specific mount options fall into this category IMO, with the notable 
exception of compress[-force], [no]autodefrag, [no]datacow, and 
[no]datasum.  Some other options do make sense as properties of the 
filesystem (commit, flushoncommit, {inode,space}_cache, max_inline, 
metadata_ratio, [no]ssd, and [no]treelog are such options), but many are 
one-off options that affect behavior on mount (like skip_balance, 
clear_cache, nologreplay, norecovery, usebackuproot, and subvol).



Re: [PATCH v5 2/5] lib: Add zstd modules

2017-08-10 Thread Hugo Mills
On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:
> On 08/10/2017 04:30 AM, Eric Biggers wrote:
> >
> >These benchmarks are misleading because they compress the whole file as a
> >single stream without resetting the dictionary, which isn't how data will
> >typically be compressed in kernel mode.  With filesystem compression the data
> >has to be divided into small chunks that can each be decompressed independently.
> >That eliminates one of the primary advantages of Zstandard (support for large
> >dictionary sizes).
> 
> I did btrfs benchmarks of kernel trees and other normal data sets as
> well.  The numbers were in line with what Nick is posting here.
> zstd is a big win over both lzo and zlib from a btrfs point of view.
> 
> It's true Nick's patches only support a single compression level in
> btrfs, but that's because btrfs doesn't have a way to pass in the
> compression ratio.  It could easily be a mount option, it was just
> outside the scope of Nick's initial work.

   Could we please not add more mount options? I get that they're easy
to implement, but it's a very blunt instrument. What we tend to see
(with both nodatacow and compress) is people using the mount options,
then asking for exceptions, discovering that they can't do that, and
then falling back to doing it with attributes or btrfs properties.
Could we just start with btrfs properties this time round, and cut out
the mount option part of this cycle?

   In the long run, it'd be great to see most of the btrfs-specific
mount options get deprecated and ultimately removed entirely, in
favour of attributes/properties, where feasible.

   Hugo.

-- 
Hugo Mills | Klytus! Are your men on the right pills? Maybe you
hugo@... carfax.org.uk | should execute their trainer!
http://carfax.org.uk/  |
PGP: E2AB1DE4  |  Ming the Merciless, Flash Gordon




Re: [PATCH v5 2/5] lib: Add zstd modules

2017-08-10 Thread Nick Terrell
On 8/10/17, 10:48 AM, "Austin S. Hemmelgarn"  wrote:
>On 2017-08-10 13:24, Eric Biggers wrote:
>>On Thu, Aug 10, 2017 at 07:32:18AM -0400, Austin S. Hemmelgarn wrote:
>>>On 2017-08-10 04:30, Eric Biggers wrote:
>>>>On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>>>>>
>>>>> It can compress at speeds approaching lz4, and quality approaching lzma.
>>>>
>>>> Well, for a very loose definition of "approaching", and certainly not at the
>>>> same time.  I doubt there's a use case for using the highest compression levels
>>>> in kernel mode --- especially the ones using zstd_opt.h.
>>> Large data-sets with WORM access patterns and infrequent writes
>>> immediately come to mind as a use case for the highest compression
>>> level.
>>>
>>> As a more specific example, the company I work for has a very large
>>> amount of documentation, and we keep all old versions.  This is all
>>> stored on a file server which is currently using BTRFS.  Once a
>>> document is written, it's almost never rewritten, so write
>>> performance only matters for the first write.  However, they're read
>>> back pretty frequently, so we need good read performance.  As of
>>> right now, the system is set to use LZO compression by default, and
>>> then when a new document is added, the previous version of that
>>> document gets re-compressed using zlib compression, which actually
>>> results in pretty significant space savings most of the time.  I
>>> would absolutely love to use zstd compression with this system with
>>> the highest compression level, because most people don't care how
>>> long it takes to write the file out, but they do care how long it
>>> takes to read a file (even if it's an older version).
>> 
>> This may be a reasonable use case, but note this cannot just be the regular
>> "zstd" compression setting, since filesystem compression by default must 
>> provide
>> reasonable performance for many different access patterns.  See the patch in
>> this series which actually adds zstd compression to btrfs; it only uses 
>> level 1.
>> I do not see a patch which adds a higher compression mode.  It would need to 
>> be
>> a special setting like "zstdhc" that users could opt-in to on specific
>> directories.  It also would need to be compared to simply compressing in
>> userspace.  In many cases compressing in userspace is probably the better
>> solution for the use case in question because it works on any filesystem, 
>> allows
>> using any compression algorithm, and if random access is not needed it is
>> possible to compress each file as a single stream (like a .xz file), which
>> produces a much better compression ratio than the block-by-block compression
>> that filesystems have to use.
> There has been discussion as well as (I think) initial patches merged 
> for support of specifying the compression level for algorithms which 
> support multiple compression levels in BTRFS.  I was actually under the 
> impression that we had decided to use level 3 as the default for zstd, 
> but that apparently isn't the case, and with the benchmark issues, it 
> may not be once proper benchmarks are run.

There are some initial patches to add compression levels to BtrFS [1]. Once
it's ready, we can add compression levels to zstd. The default compression
level in the current patch is 3.

[1] https://lkml.kernel.org/r/20170724172939.24527-1-dste...@suse.com



Re: [PATCH v5 2/5] lib: Add zstd modules

2017-08-10 Thread Nick Terrell
On 8/10/17, 1:30 AM, "Eric Biggers"  wrote:
> On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>>
>> It can compress at speeds approaching lz4, and quality approaching lzma.
> 
> Well, for a very loose definition of "approaching", and certainly not at the
> same time.  I doubt there's a use case for using the highest compression levels
> in kernel mode --- especially the ones using zstd_opt.h.
> 
>> 
>> The code was ported from the upstream zstd source repository.
> 
> What version?

zstd-1.1.4 with patches applied from upstream. I'll include it in the
next patch version.

>> `linux/zstd.h` header was modified to match linux kernel style.
>> The cross-platform and allocation code was stripped out. Instead zstd
>> requires the caller to pass a preallocated workspace. The source files
>> were clang-formatted [1] to match the Linux Kernel style as much as
>> possible. 
> 
> It would be easier to compare to the upstream version if it was not all
> reformatted.  There is a chance that bugs were introduced by Linux-specific
> changes, and it would be nice if they could be easily reviewed.  (Also I don't
> know what clang-format settings you used, but there are still a lot of
> differences from the Linux coding style.)

The clang-format settings I used are available in the zstd repo [1]. I left
the line length long, since it looked terrible otherwise. I set up a branch
in my zstd GitHub fork called "original-formatted" [2]. I took the source I
based the kernel patches on [3] and ran clang-format without any other
changes. If you have any suggestions to improve the clang-formatting,
please let me know.

>> 
>> I benchmarked zstd compression as a special character device. I ran zstd
>> and zlib compression at several levels, as well as performing no
>> compression, which measures the time spent copying the data to kernel space.
>> Data is passed to the compressor 4096 B at a time. The benchmark file is
>> located in the upstream zstd source repository under
>> `contrib/linux-kernel/zstd_compress_test.c` [2].
>> 
>> I ran the benchmarks on an Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
>> The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
>> 16 GB of RAM, and a SSD. I benchmarked using `silesia.tar` [3], which is
>> 211,988,480 B large. Run the following commands for the benchmark:
>> 
>> sudo modprobe zstd_compress_test
>> sudo mknod zstd_compress_test c 245 0
>> sudo cp silesia.tar zstd_compress_test
>> 
>> The time is reported by the time of the userland `cp`.
>> The MB/s is computed with
>> 
>> 1,536,217,008 B / time(buffer size, hash)
>> 
>> which includes the time to copy from userland.
>> The Adjusted MB/s is computed with
>> 
>> 1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).
>> 
>> The memory reported is the amount of memory the compressor requests.
>> 
>> | Method   | Size (B) | Time (s) | Ratio | MB/s| Adj MB/s | Mem (MB) |
>> |--|--|--|---|-|--|--|
>> | none | 11988480 |0.100 | 1 | 2119.88 |- |- |
>> | zstd -1  | 73645762 |1.044 | 2.878 |  203.05 |   224.56 | 1.23 |
>> | zstd -3  | 66988878 |1.761 | 3.165 |  120.38 |   127.63 | 2.47 |
>> | zstd -5  | 65001259 |2.563 | 3.261 |   82.71 |86.07 | 2.86 |
>> | zstd -10 | 60165346 |   13.242 | 3.523 |   16.01 |16.13 |13.22 |
>> | zstd -15 | 58009756 |   47.601 | 3.654 |4.45 | 4.46 |21.61 |
>> | zstd -19 | 54014593 |  102.835 | 3.925 |2.06 | 2.06 |60.15 |
>> | zlib -1  | 77260026 |2.895 | 2.744 |   73.23 |75.85 | 0.27 |
>> | zlib -3  | 72972206 |4.116 | 2.905 |   51.50 |52.79 | 0.27 |
>> | zlib -6  | 68190360 |9.633 | 3.109 |   22.01 |22.24 | 0.27 |
>> | zlib -9  | 67613382 |   22.554 | 3.135 |9.40 | 9.44 | 0.27 |
>> 
> 
> These benchmarks are misleading because they compress the whole file as a
> single stream without resetting the dictionary, which isn't how data will
> typically be compressed in kernel mode.  With filesystem compression the data
> has to be divided into small chunks that can each be decompressed independently.
> That eliminates one of the primary advantages of Zstandard (support for large
> dictionary sizes).

This benchmark isn't meant to be representative of a filesystem scenario. I
wanted to show off zstd without anything else going on. Even in filesystems
where the data is chunked, zstd uses the whole chunk as the window (128 KB
in BtrFS and SquashFS by default), whereas zlib uses 32 KB. I have benchmarks
for BtrFS and SquashFS in their respective patches [4][5], and I've copied
the BtrFS table below (which was run with 2 threads).

| Method  | Ratio | Compression MB/s | Decompression speed |
|-|---|--|-|
| None|  0.99 |  504 | 686 |
| lzo |  1.66 |

Re: [PATCH v5 2/5] lib: Add zstd modules

2017-08-10 Thread Chris Mason

On 08/10/2017 03:00 PM, Eric Biggers wrote:

On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:

On 08/10/2017 04:30 AM, Eric Biggers wrote:

On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:



The memory reported is the amount of memory the compressor requests.

| Method   | Size (B) | Time (s) | Ratio | MB/s| Adj MB/s | Mem (MB) |
|--|--|--|---|-|--|--|
| none | 11988480 |0.100 | 1 | 2119.88 |- |- |
| zstd -1  | 73645762 |1.044 | 2.878 |  203.05 |   224.56 | 1.23 |
| zstd -3  | 66988878 |1.761 | 3.165 |  120.38 |   127.63 | 2.47 |
| zstd -5  | 65001259 |2.563 | 3.261 |   82.71 |86.07 | 2.86 |
| zstd -10 | 60165346 |   13.242 | 3.523 |   16.01 |16.13 |13.22 |
| zstd -15 | 58009756 |   47.601 | 3.654 |4.45 | 4.46 |21.61 |
| zstd -19 | 54014593 |  102.835 | 3.925 |2.06 | 2.06 |60.15 |
| zlib -1  | 77260026 |2.895 | 2.744 |   73.23 |75.85 | 0.27 |
| zlib -3  | 72972206 |4.116 | 2.905 |   51.50 |52.79 | 0.27 |
| zlib -6  | 68190360 |9.633 | 3.109 |   22.01 |22.24 | 0.27 |
| zlib -9  | 67613382 |   22.554 | 3.135 |9.40 | 9.44 | 0.27 |



These benchmarks are misleading because they compress the whole file as a
single stream without resetting the dictionary, which isn't how data will
typically be compressed in kernel mode.  With filesystem compression the data
has to be divided into small chunks that can each be decompressed independently.
That eliminates one of the primary advantages of Zstandard (support for large
dictionary sizes).


I did btrfs benchmarks of kernel trees and other normal data sets as
well.  The numbers were in line with what Nick is posting here.
zstd is a big win over both lzo and zlib from a btrfs point of view.

It's true Nick's patches only support a single compression level in
btrfs, but that's because btrfs doesn't have a way to pass in the
compression ratio.  It could easily be a mount option, it was just
outside the scope of Nick's initial work.



I am not surprised --- Zstandard is closer to the state of the art, both
format-wise and implementation-wise, than the other choices in BTRFS.  My point
is that benchmarks need to account for how much data is compressed at a time.
This is a common mistake when comparing different compression algorithms; the
algorithm name and compression level do not tell the whole story.  The
dictionary size is extremely significant.  No one is going to compress or
decompress a 200 MB file as a single stream in kernel mode, so it does not make
sense to justify adding Zstandard *to the kernel* based on such a benchmark.  It
is going to be divided into chunks.  How big are the chunks in BTRFS?  I thought
that it compressed only one page (4 KiB) at a time, but I hope that has been, or
is being, improved; 32 KiB - 128 KiB should be a better amount.  (And if the
amount of data compressed at a time happens to be different between the
different algorithms, note that BTRFS benchmarks are likely to be measuring that
as much as the algorithms themselves.)


Btrfs hooks the compression code into the delayed allocation mechanism 
we use to gather large extents for COW.  So if you write 100MB to a 
file, we'll have 100MB to compress at a time (within the limits of the 
amount of pages we allow to collect before forcing it down).


But we want to balance how much memory you might need to uncompress 
during random reads.  So we have an artificial limit of 128KB that we 
send at a time to the compression code.  It's easy to change this; it's 
just a tradeoff made to limit the cost of reading small bits.


It's the same for zlib, lzo, and the new zstd patch.
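
To make the chunked flow concrete, here is a minimal sketch of compressing one
such 128KB chunk independently with the single-shot, workspace-based API from
the proposed linux/zstd.h (ZSTD_getParams, ZSTD_CCtxWorkspaceBound,
ZSTD_initCCtx, ZSTD_compressCCtx).  It is illustrative only; the actual btrfs
patch uses the streaming interface and preallocates its workspaces rather than
allocating one per call:

#include <linux/errno.h>
#include <linux/vmalloc.h>
#include <linux/zstd.h>

/*
 * Sketch only: compress a single chunk (e.g. one 128KB extent) with a fresh
 * context, so no dictionary state is shared across chunks.
 */
static int compress_one_chunk(const void *src, size_t src_len,
			      void *dst, size_t dst_cap, size_t *out_len)
{
	ZSTD_parameters params = ZSTD_getParams(1, src_len, 0); /* level 1, no dict */
	size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
	void *wksp = vmalloc(wksp_size);	/* caller-supplied workspace */
	ZSTD_CCtx *cctx;
	size_t ret;

	if (!wksp)
		return -ENOMEM;
	cctx = ZSTD_initCCtx(wksp, wksp_size);
	if (!cctx) {
		vfree(wksp);
		return -EINVAL;
	}
	ret = ZSTD_compressCCtx(cctx, dst, dst_cap, src, src_len, params);
	vfree(wksp);
	if (ZSTD_isError(ret))
		return -E2BIG;	/* e.g. the output did not fit in dst_cap */
	*out_len = ret;
	return 0;
}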

-chris



Re: [PATCH v5 2/5] lib: Add zstd modules

2017-08-10 Thread Eric Biggers
On Thu, Aug 10, 2017 at 01:41:21PM -0400, Chris Mason wrote:
> On 08/10/2017 04:30 AM, Eric Biggers wrote:
> >On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
> 
> >>The memory reported is the amount of memory the compressor requests.
> >>
> >>| Method   | Size (B) | Time (s) | Ratio | MB/s| Adj MB/s | Mem (MB) |
> >>|--|--|--|---|-|--|--|
> >>| none | 11988480 |0.100 | 1 | 2119.88 |- |- |
> >>| zstd -1  | 73645762 |1.044 | 2.878 |  203.05 |   224.56 | 1.23 |
> >>| zstd -3  | 66988878 |1.761 | 3.165 |  120.38 |   127.63 | 2.47 |
> >>| zstd -5  | 65001259 |2.563 | 3.261 |   82.71 |86.07 | 2.86 |
> >>| zstd -10 | 60165346 |   13.242 | 3.523 |   16.01 |16.13 |13.22 |
> >>| zstd -15 | 58009756 |   47.601 | 3.654 |4.45 | 4.46 |21.61 |
> >>| zstd -19 | 54014593 |  102.835 | 3.925 |2.06 | 2.06 |60.15 |
> >>| zlib -1  | 77260026 |2.895 | 2.744 |   73.23 |75.85 | 0.27 |
> >>| zlib -3  | 72972206 |4.116 | 2.905 |   51.50 |52.79 | 0.27 |
> >>| zlib -6  | 68190360 |9.633 | 3.109 |   22.01 |22.24 | 0.27 |
> >>| zlib -9  | 67613382 |   22.554 | 3.135 |9.40 | 9.44 | 0.27 |
> >>
> >
> >These benchmarks are misleading because they compress the whole file as a
> >single stream without resetting the dictionary, which isn't how data will
> >typically be compressed in kernel mode.  With filesystem compression the data
> >has to be divided into small chunks that can each be decompressed independently.
> >That eliminates one of the primary advantages of Zstandard (support for large
> >dictionary sizes).
> 
> I did btrfs benchmarks of kernel trees and other normal data sets as
> well.  The numbers were in line with what Nick is posting here.
> zstd is a big win over both lzo and zlib from a btrfs point of view.
> 
> It's true Nick's patches only support a single compression level in
> btrfs, but that's because btrfs doesn't have a way to pass in the
> compression ratio.  It could easily be a mount option, it was just
> outside the scope of Nick's initial work.
> 

I am not surprised --- Zstandard is closer to the state of the art, both
format-wise and implementation-wise, than the other choices in BTRFS.  My point
is that benchmarks need to account for how much data is compressed at a time.
This is a common mistake when comparing different compression algorithms; the
algorithm name and compression level do not tell the whole story.  The
dictionary size is extremely significant.  No one is going to compress or
decompress a 200 MB file as a single stream in kernel mode, so it does not make
sense to justify adding Zstandard *to the kernel* based on such a benchmark.  It
is going to be divided into chunks.  How big are the chunks in BTRFS?  I thought
that it compressed only one page (4 KiB) at a time, but I hope that has been, or
is being, improved; 32 KiB - 128 KiB should be a better amount.  (And if the
amount of data compressed at a time happens to be different between the
different algorithms, note that BTRFS benchmarks are likely to be measuring that
as much as the algorithms themselves.)

Eric


Re: [PATCH v5 2/5] lib: Add zstd modules

2017-08-10 Thread Austin S. Hemmelgarn

On 2017-08-10 13:24, Eric Biggers wrote:

On Thu, Aug 10, 2017 at 07:32:18AM -0400, Austin S. Hemmelgarn wrote:

On 2017-08-10 04:30, Eric Biggers wrote:

On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:


It can compress at speeds approaching lz4, and quality approaching lzma.


Well, for a very loose definition of "approaching", and certainly not at the
same time.  I doubt there's a use case for using the highest compression levels
in kernel mode --- especially the ones using zstd_opt.h.

Large data-sets with WORM access patterns and infrequent writes
immediately come to mind as a use case for the highest compression
level.

As a more specific example, the company I work for has a very large
amount of documentation, and we keep all old versions.  This is all
stored on a file server which is currently using BTRFS.  Once a
document is written, it's almost never rewritten, so write
performance only matters for the first write.  However, they're read
back pretty frequently, so we need good read performance.  As of
right now, the system is set to use LZO compression by default, and
then when a new document is added, the previous version of that
document gets re-compressed using zlib compression, which actually
results in pretty significant space savings most of the time.  I
would absolutely love to use zstd compression with this system with
the highest compression level, because most people don't care how
long it takes to write the file out, but they do care how long it
takes to read a file (even if it's an older version).


This may be a reasonable use case, but note this cannot just be the regular
"zstd" compression setting, since filesystem compression by default must provide
reasonable performance for many different access patterns.  See the patch in
this series which actually adds zstd compression to btrfs; it only uses level 1.
I do not see a patch which adds a higher compression mode.  It would need to be
a special setting like "zstdhc" that users could opt-in to on specific
directories.  It also would need to be compared to simply compressing in
userspace.  In many cases compressing in userspace is probably the better
solution for the use case in question because it works on any filesystem, allows
using any compression algorithm, and if random access is not needed it is
possible to compress each file as a single stream (like a .xz file), which
produces a much better compression ratio than the block-by-block compression
that filesystems have to use.
There has been discussion as well as (I think) initial patches merged 
for support of specifying the compression level for algorithms which 
support multiple compression levels in BTRFS.  I was actually under the 
impression that we had decided to use level 3 as the default for zstd, 
but that apparently isn't the case, and with the benchmark issues, it 
may not be once proper benchmarks are run.


Also, on the note of compressing in userspace, the use case I quoted at 
least can't do that because we have to deal with Windows clients and 
users have to be able to open files directly on said Windows clients.  I 
entirely agree that real archival storage is better off using userspace 
compression, but sometimes real archival storage isn't an option.


Note also that LZ4HC is in the kernel source tree currently but no one is using
it vs. the regular LZ4.  I think it is the kind of thing that sounded useful
originally, but at the end of the day no one really wants to use it in kernel
mode.  I'd certainly be interested in actual patches, though.
Part of that is the fact that BTRFS is one of the only consumers (AFAIK) 
of this API that can freely choose all aspects of their usage, and the 
consensus here (which I don't agree with, I might add) amounts to the 
argument that 'we already have  compression with a  compression 
ratio, we don't need more things like that'.  I would personally love to 
see LZ4HC support in BTRFS (based on testing my own use cases, LZ4 is 
more deterministic than LZO for both compression and decompression, and 
most of the non archival usage I have of BTRFS benefits from 
determinism), but there's not any point in me writing up such a patch 
because it's almost certain to get rejected because BTRFS already has 
LZO.  The main reason that zstd is getting considered at all is that the 
quoted benchmarks show clear benefits in decompression speed relative to 
zlib and far better compression ratios than LZO.



Re: [PATCH v5 2/5] lib: Add zstd modules

2017-08-10 Thread Chris Mason

On 08/10/2017 04:30 AM, Eric Biggers wrote:

On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:



The memory reported is the amount of memory the compressor requests.

| Method   | Size (B) | Time (s) | Ratio | MB/s| Adj MB/s | Mem (MB) |
|--|--|--|---|-|--|--|
| none | 11988480 |0.100 | 1 | 2119.88 |- |- |
| zstd -1  | 73645762 |1.044 | 2.878 |  203.05 |   224.56 | 1.23 |
| zstd -3  | 66988878 |1.761 | 3.165 |  120.38 |   127.63 | 2.47 |
| zstd -5  | 65001259 |2.563 | 3.261 |   82.71 |86.07 | 2.86 |
| zstd -10 | 60165346 |   13.242 | 3.523 |   16.01 |16.13 |13.22 |
| zstd -15 | 58009756 |   47.601 | 3.654 |4.45 | 4.46 |21.61 |
| zstd -19 | 54014593 |  102.835 | 3.925 |2.06 | 2.06 |60.15 |
| zlib -1  | 77260026 |2.895 | 2.744 |   73.23 |75.85 | 0.27 |
| zlib -3  | 72972206 |4.116 | 2.905 |   51.50 |52.79 | 0.27 |
| zlib -6  | 68190360 |9.633 | 3.109 |   22.01 |22.24 | 0.27 |
| zlib -9  | 67613382 |   22.554 | 3.135 |9.40 | 9.44 | 0.27 |



These benchmarks are misleading because they compress the whole file as a
single stream without resetting the dictionary, which isn't how data will
typically be compressed in kernel mode.  With filesystem compression the data
has to be divided into small chunks that can each be decompressed independently.
That eliminates one of the primary advantages of Zstandard (support for large
dictionary sizes).


I did btrfs benchmarks of kernel trees and other normal data sets as 
well.  The numbers were in line with what Nick is posting here.  zstd is 
a big win over both lzo and zlib from a btrfs point of view.


It's true Nick's patches only support a single compression level in 
btrfs, but that's because btrfs doesn't have a way to pass in the 
compression ratio.  It could easily be a mount option, it was just 
outside the scope of Nick's initial work.


-chris





Re: [PATCH v5 2/5] lib: Add zstd modules

2017-08-10 Thread Eric Biggers
On Thu, Aug 10, 2017 at 10:57:01AM -0400, Austin S. Hemmelgarn wrote:
> Also didn't think to mention this, but I could see the max level
> being very popular for use with SquashFS root filesystems used in
> LiveCDs. Currently, they have to decide between read performance
> and image size, while zstd would provide both.

The high compression levels of Zstandard are indeed a great fit for SquashFS,
but SquashFS images are created in userspace by squashfs-tools.  The kernel only
needs to be able to decompress them.

(Also, while Zstandard provides very good tradeoffs and will probably become the
preferred algorithm for SquashFS, it's misleading to imply that users won't have
to make decisions anymore.  It does not compress as well as XZ or decompress as
fast as LZ4, except maybe in very carefully crafted benchmarks.)

Eric


Re: [PATCH v5 2/5] lib: Add zstd modules

2017-08-10 Thread Eric Biggers
On Thu, Aug 10, 2017 at 07:32:18AM -0400, Austin S. Hemmelgarn wrote:
> On 2017-08-10 04:30, Eric Biggers wrote:
> >On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
> >>
> >>It can compress at speeds approaching lz4, and quality approaching lzma.
> >
> >Well, for a very loose definition of "approaching", and certainly not at the
> >same time.  I doubt there's a use case for using the highest compression levels
> >in kernel mode --- especially the ones using zstd_opt.h.
> Large data-sets with WORM access patterns and infrequent writes
> immediately come to mind as a use case for the highest compression
> level.
> 
> As a more specific example, the company I work for has a very large
> amount of documentation, and we keep all old versions.  This is all
> stored on a file server which is currently using BTRFS.  Once a
> document is written, it's almost never rewritten, so write
> performance only matters for the first write.  However, they're read
> back pretty frequently, so we need good read performance.  As of
> right now, the system is set to use LZO compression by default, and
> then when a new document is added, the previous version of that
> document gets re-compressed using zlib compression, which actually
> results in pretty significant space savings most of the time.  I
> would absolutely love to use zstd compression with this system with
> the highest compression level, because most people don't care how
> long it takes to write the file out, but they do care how long it
> takes to read a file (even if it's an older version).

This may be a reasonable use case, but note this cannot just be the regular
"zstd" compression setting, since filesystem compression by default must provide
reasonable performance for many different access patterns.  See the patch in
this series which actually adds zstd compression to btrfs; it only uses level 1.
I do not see a patch which adds a higher compression mode.  It would need to be
a special setting like "zstdhc" that users could opt-in to on specific
directories.  It also would need to be compared to simply compressing in
userspace.  In many cases compressing in userspace is probably the better
solution for the use case in question because it works on any filesystem, allows
using any compression algorithm, and if random access is not needed it is
possible to compress each file as a single stream (like a .xz file), which
produces a much better compression ratio than the block-by-block compression
that filesystems have to use.

Note also that LZ4HC is in the kernel source tree currently but no one is using
it vs. the regular LZ4.  I think it is the kind of thing that sounded useful
originally, but at the end of the day no one really wants to use it in kernel
mode.  I'd certainly be interested in actual patches, though.

Eric


Re: [PATCH v5 2/5] lib: Add zstd modules

2017-08-10 Thread Austin S. Hemmelgarn

On 2017-08-10 07:32, Austin S. Hemmelgarn wrote:

On 2017-08-10 04:30, Eric Biggers wrote:

On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:


It can compress at speeds approaching lz4, and quality approaching lzma.


Well, for a very loose definition of "approaching", and certainly not at the
same time.  I doubt there's a use case for using the highest compression levels
in kernel mode --- especially the ones using zstd_opt.h.
Large data-sets with WORM access patterns and infrequent writes 
immediately come to mind as a use case for the highest compression level.


As a more specific example, the company I work for has a very large 
amount of documentation, and we keep all old versions.  This is all 
stored on a file server which is currently using BTRFS.  Once a document 
is written, it's almost never rewritten, so write performance only 
matters for the first write.  However, they're read back pretty 
frequently, so we need good read performance.  As of right now, the 
system is set to use LZO compression by default, and then when a new 
document is added, the previous version of that document gets 
re-compressed using zlib compression, which actually results in pretty 
significant space savings most of the time.  I would absolutely love to 
use zstd compression with this system with the highest compression 
level, because most people don't care how long it takes to write the 
file out, but they do care how long it takes to read a file (even if 
it's an older version).
Also didn't think to mention this, but I could see the max level being 
very popular for use with SquashFS root filesystems used in LiveCDs. 
Currently, they have to decide between read performance and image size, 
while zstd would provide both.



Re: [PATCH v5 2/5] lib: Add zstd modules

2017-08-10 Thread Austin S. Hemmelgarn

On 2017-08-10 04:30, Eric Biggers wrote:

On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:


It can compress at speeds approaching lz4, and quality approaching lzma.


Well, for a very loose definition of "approaching", and certainly not at the
same time.  I doubt there's a use case for using the highest compression levels
in kernel mode --- especially the ones using zstd_opt.h.
Large data-sets with WORM access patterns and infrequent writes 
immediately come to mind as a use case for the highest compression level.


As a more specific example, the company I work for has a very large 
amount of documentation, and we keep all old versions.  This is all 
stored on a file server which is currently using BTRFS.  Once a document 
is written, it's almost never rewritten, so write performance only 
matters for the first write.  However, they're read back pretty 
frequently, so we need good read performance.  As of right now, the 
system is set to use LZO compression by default, and then when a new 
document is added, the previous version of that document gets 
re-compressed using zlib compression, which actually results in pretty 
significant space savings most of the time.  I would absolutely love to 
use zstd compression with this system with the highest compression 
level, because most people don't care how long it takes to write the 
file out, but they do care how long it takes to read a file (even if 
it's an older version).




The code was ported from the upstream zstd source repository.


What version?


`linux/zstd.h` header was modified to match linux kernel style.
The cross-platform and allocation code was stripped out. Instead zstd
requires the caller to pass a preallocated workspace. The source files
were clang-formatted [1] to match the Linux Kernel style as much as
possible.


It would be easier to compare to the upstream version if it was not all
reformatted.  There is a chance that bugs were introduced by Linux-specific
changes, and it would be nice if they could be easily reviewed.  (Also I don't
know what clang-format settings you used, but there are still a lot of
differences from the Linux coding style.)



I benchmarked zstd compression as a special character device. I ran zstd
and zlib compression at several levels, as well as performing no
compression, which measures the time spent copying the data to kernel space.
Data is passed to the compressor 4096 B at a time. The benchmark file is
located in the upstream zstd source repository under
`contrib/linux-kernel/zstd_compress_test.c` [2].

I ran the benchmarks on an Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
16 GB of RAM, and a SSD. I benchmarked using `silesia.tar` [3], which is
211,988,480 B large. Run the following commands for the benchmark:

 sudo modprobe zstd_compress_test
 sudo mknod zstd_compress_test c 245 0
 sudo cp silesia.tar zstd_compress_test

The time is reported by the time of the userland `cp`.
The MB/s is computed with

 1,536,217,008 B / time(buffer size, hash)

which includes the time to copy from userland.
The Adjusted MB/s is computed with

 1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).

The memory reported is the amount of memory the compressor requests.

| Method   | Size (B) | Time (s) | Ratio | MB/s| Adj MB/s | Mem (MB) |
|--|--|--|---|-|--|--|
| none | 11988480 |0.100 | 1 | 2119.88 |- |- |
| zstd -1  | 73645762 |1.044 | 2.878 |  203.05 |   224.56 | 1.23 |
| zstd -3  | 66988878 |1.761 | 3.165 |  120.38 |   127.63 | 2.47 |
| zstd -5  | 65001259 |2.563 | 3.261 |   82.71 |86.07 | 2.86 |
| zstd -10 | 60165346 |   13.242 | 3.523 |   16.01 |16.13 |13.22 |
| zstd -15 | 58009756 |   47.601 | 3.654 |4.45 | 4.46 |21.61 |
| zstd -19 | 54014593 |  102.835 | 3.925 |2.06 | 2.06 |60.15 |
| zlib -1  | 77260026 |2.895 | 2.744 |   73.23 |75.85 | 0.27 |
| zlib -3  | 72972206 |4.116 | 2.905 |   51.50 |52.79 | 0.27 |
| zlib -6  | 68190360 |9.633 | 3.109 |   22.01 |22.24 | 0.27 |
| zlib -9  | 67613382 |   22.554 | 3.135 |9.40 | 9.44 | 0.27 |



These benchmarks are misleading because they compress the whole file as a
single stream without resetting the dictionary, which isn't how data will
typically be compressed in kernel mode.  With filesystem compression the data
has to be divided into small chunks that can each be decompressed independently.
That eliminates one of the primary advantages of Zstandard (support for large
dictionary sizes).



Re: [PATCH v5 2/5] lib: Add zstd modules

2017-08-10 Thread Eric Biggers
On Wed, Aug 09, 2017 at 07:35:53PM -0700, Nick Terrell wrote:
>
> It can compress at speeds approaching lz4, and quality approaching lzma.

Well, for a very loose definition of "approaching", and certainly not at the
same time.  I doubt there's a use case for using the highest compression levels
in kernel mode --- especially the ones using zstd_opt.h.

> 
> The code was ported from the upstream zstd source repository.

What version?

> `linux/zstd.h` header was modified to match linux kernel style.
> The cross-platform and allocation code was stripped out. Instead zstd
> requires the caller to pass a preallocated workspace. The source files
> were clang-formatted [1] to match the Linux Kernel style as much as
> possible. 

It would be easier to compare to the upstream version if it was not all
reformatted.  There is a chance that bugs were introduced by Linux-specific
changes, and it would be nice if they could be easily reviewed.  (Also I don't
know what clang-format settings you used, but there are still a lot of
differences from the Linux coding style.)

> 
> I benchmarked zstd compression as a special character device. I ran zstd
> and zlib compression at several levels, as well as performing no
> compression, which measures the time spent copying the data to kernel space.
> Data is passed to the compressor 4096 B at a time. The benchmark file is
> located in the upstream zstd source repository under
> `contrib/linux-kernel/zstd_compress_test.c` [2].
> 
> I ran the benchmarks on an Ubuntu 14.04 VM with 2 cores and 4 GiB of RAM.
> The VM is running on a MacBook Pro with a 3.1 GHz Intel Core i7 processor,
> 16 GB of RAM, and a SSD. I benchmarked using `silesia.tar` [3], which is
> 211,988,480 B large. Run the following commands for the benchmark:
> 
> sudo modprobe zstd_compress_test
> sudo mknod zstd_compress_test c 245 0
> sudo cp silesia.tar zstd_compress_test
> 
> The time is reported by the time of the userland `cp`.
> The MB/s is computed with
> 
> 1,536,217,008 B / time(buffer size, hash)
> 
> which includes the time to copy from userland.
> The Adjusted MB/s is computed with
> 
> 1,536,217,088 B / (time(buffer size, hash) - time(buffer size, none)).
> 
> The memory reported is the amount of memory the compressor requests.
> 
> | Method   | Size (B) | Time (s) | Ratio | MB/s| Adj MB/s | Mem (MB) |
> |--|--|--|---|-|--|--|
> | none | 11988480 |0.100 | 1 | 2119.88 |- |- |
> | zstd -1  | 73645762 |1.044 | 2.878 |  203.05 |   224.56 | 1.23 |
> | zstd -3  | 66988878 |1.761 | 3.165 |  120.38 |   127.63 | 2.47 |
> | zstd -5  | 65001259 |2.563 | 3.261 |   82.71 |86.07 | 2.86 |
> | zstd -10 | 60165346 |   13.242 | 3.523 |   16.01 |16.13 |13.22 |
> | zstd -15 | 58009756 |   47.601 | 3.654 |4.45 | 4.46 |21.61 |
> | zstd -19 | 54014593 |  102.835 | 3.925 |2.06 | 2.06 |60.15 |
> | zlib -1  | 77260026 |2.895 | 2.744 |   73.23 |75.85 | 0.27 |
> | zlib -3  | 72972206 |4.116 | 2.905 |   51.50 |52.79 | 0.27 |
> | zlib -6  | 68190360 |9.633 | 3.109 |   22.01 |22.24 | 0.27 |
> | zlib -9  | 67613382 |   22.554 | 3.135 |9.40 | 9.44 | 0.27 |
> 
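
(For example, dividing the 211,988,480 B silesia.tar input by (1.044 s - 0.100 s)
reproduces the zstd -1 Adjusted figure of 224.56 MB/s, and dividing by 1.044 s
alone gives the plain 203.05 MB/s.)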

These benchmarks are misleading because they compress the whole file as a
single stream without resetting the dictionary, which isn't how data will
typically be compressed in kernel mode.  With filesystem compression the data
has to be divided into small chunks that can each be decompressed independently.
That eliminates one of the primary advantages of Zstandard (support for large
dictionary sizes).

Eric