Re: [PATCH v2] btrfs: add zstd compression level support

2018-11-19 Thread Omar Sandoval
On Tue, Nov 13, 2018 at 01:33:32AM +0100, David Sterba wrote:
> On Wed, Oct 31, 2018 at 11:11:08AM -0700, Nick Terrell wrote:
> > From: Jennifer Liu 
> > 
> > Adds zstd compression level support to btrfs. Zstd requires
> > different amounts of memory for each level, so the design had
> > to be modified to allow set_level() to allocate memory. We
> > preallocate one workspace of the maximum size to guarantee
> > forward progress. This feature is expected to be useful for
> > read-mostly filesystems, or when creating images.
> > 
> > Benchmarks run in qemu on Intel x86 with a single core.
> > The benchmark measures the time to copy the Silesia corpus [0] to
> > a btrfs filesystem 10 times, then read it back.
> > 
> > The two important things to note are:
> > - The decompression speed and memory remains constant.
> >   The memory required to decompress is the same as level 1.
> > - The compression speed and ratio will vary based on the source.
> > 
> > Level   Ratio   Compression   Decompression   Compression Memory
> > 1       2.59    153 MB/s      112 MB/s        0.8 MB
> > 2       2.67    136 MB/s      113 MB/s        1.0 MB
> > 3       2.72    106 MB/s      115 MB/s        1.3 MB
> > 4       2.78     86 MB/s      109 MB/s        0.9 MB
> > 5       2.83     69 MB/s      109 MB/s        1.4 MB
> > 6       2.89     53 MB/s      110 MB/s        1.5 MB
> > 7       2.91     40 MB/s      112 MB/s        1.4 MB
> > 8       2.92     34 MB/s      110 MB/s        1.8 MB
> > 9       2.93     27 MB/s      109 MB/s        1.8 MB
> > 10      2.94     22 MB/s      109 MB/s        1.8 MB
> > 11      2.95     17 MB/s      114 MB/s        1.8 MB
> > 12      2.95     13 MB/s      113 MB/s        1.8 MB
> > 13      2.95     10 MB/s      111 MB/s        2.3 MB
> > 14      2.99      7 MB/s      110 MB/s        2.6 MB
> > 15      3.03      6 MB/s      110 MB/s        2.6 MB
> > 
> > [0] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
> > 
> > Signed-off-by: Jennifer Liu 
> > Signed-off-by: Nick Terrell 
> > Reviewed-by: Omar Sandoval 
> > ---
> > v1 -> v2:
> > - Don't reflow the unchanged line.
> > 

[snip]

> > -static struct list_head *zstd_alloc_workspace(void)
> > +static bool zstd_set_level(struct list_head *ws, unsigned int level)
> > +{
> > +   struct workspace *workspace = list_entry(ws, struct workspace, list);
> > +   ZSTD_parameters params;
> > +   int size;
> > +
> > +   if (level > BTRFS_ZSTD_MAX_LEVEL)
> > +   level = BTRFS_ZSTD_MAX_LEVEL;
> > +
> > +   if (level == 0)
> > +   level = BTRFS_ZSTD_DEFAULT_LEVEL;
> > +
> > +   params = ZSTD_getParams(level, ZSTD_BTRFS_MAX_INPUT, 0);
> > +   size = max_t(size_t,
> > +   ZSTD_CStreamWorkspaceBound(params.cParams),
> > +   ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT));
> > +   if (size > workspace->size) {
> > +   if (!zstd_reallocate_mem(workspace, size))
> 
> This can allocate memory, and this can happen on the writeout path, i.e.
> one of the reasons for that might be that the system needs more memory.
> 
> By the table above, the size can be up to 2.6MiB, which is a lot in
> terms of kernel memory: either contiguous unmapped memory must be
> found, or virtual mappings must be created. Both are scarce resources,
> or should be treated as such.
> 
> Given that there's no logic that would try to minimize the usage for
> workspaces, this can allocate many workspaces of that size.
> 
> Currently the workspace allocations have been moved to the early module
> loading phase so that they don't happen later and we don't have to
> allocate memory or handle the failures. Your patch brings that back.

Even before this patch, we may try to allocate a workspace. See
__find_workspace():

https://github.com/kdave/btrfs-devel/blob/fd0f5617a8a2ee92dd461d01cf9c5c37363ccc8d/fs/btrfs/compression.c#L897

We already limit it to one per CPU, and only allocate when needed.
Anything greater than that has to wait. Maybe we should improve that to
also include a limit on the total amount of memory allocated? That would
be more flexible than your approach below of making the > default case
special, and I like it more than Timofey's idea of falling back to a
lower level.
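
As a sketch of what that total-memory limit could look like (hypothetical
names, not part of the patch; the real gating would sit next to the
existing num_online_cpus() check in __find_workspace()):

/* Hypothetical global budget shared by all compression workspaces. */
#define BTRFS_WS_BUDGET_BYTES	(8 * 1024 * 1024)

static atomic_long_t btrfs_ws_bytes = ATOMIC_LONG_INIT(0);

/*
 * Try to reserve 'size' bytes of the global budget before allocating a
 * new workspace; on failure the caller waits for a free workspace, the
 * same way it already does when the per-CPU limit is hit.
 */
static bool btrfs_ws_budget_get(size_t size)
{
	if (atomic_long_add_return(size, &btrfs_ws_bytes) >
	    BTRFS_WS_BUDGET_BYTES) {
		atomic_long_sub(size, &btrfs_ws_bytes);
		return false;
	}
	return true;
}

/* Release the reservation when a workspace is freed. */
static void btrfs_ws_budget_put(size_t size)
{
	atomic_long_sub(size, &btrfs_ws_bytes);
}

The preallocated max-level workspace would be reserved outside the
budget, so the forward-progress guarantee is unaffected.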

> The solution I'm currently thinking about can make the levels work but
> would be limited in throughput as a trade-off for the memory
> consumption.
> 
> - preallocate one workspace for level 15 per mounted filesystem, using

Re: [PATCH v2] btrfs: add zstd compression level support

2018-11-19 Thread Omar Sandoval
On Tue, Nov 13, 2018 at 04:29:33PM +0300, Timofey Titovets wrote:
> Tue, Nov 13, 2018 at 04:52, Nick Terrell :
> >
> >
> >
> > > On Nov 12, 2018, at 4:33 PM, David Sterba  wrote:
> > >
> > > On Wed, Oct 31, 2018 at 11:11:08AM -0700, Nick Terrell wrote:
> > >> From: Jennifer Liu 
> > >>
> > >> Adds zstd compression level support to btrfs. Zstd requires
> > >> different amounts of memory for each level, so the design had
> > >> to be modified to allow set_level() to allocate memory. We
> > >> preallocate one workspace of the maximum size to guarantee
> > >> forward progress. This feature is expected to be useful for
> > >> read-mostly filesystems, or when creating images.
> > >>
> > >> Benchmarks run in qemu on Intel x86 with a single core.
> > >> The benchmark measures the time to copy the Silesia corpus [0] to
> > >> a btrfs filesystem 10 times, then read it back.
> > >>
> > >> The two important things to note are:
> > >> - The decompression speed and memory remains constant.
> > >>  The memory required to decompress is the same as level 1.
> > >> - The compression speed and ratio will vary based on the source.
> > >>
> > >> Level   Ratio   Compression   Decompression   Compression Memory
> > >> 1       2.59    153 MB/s      112 MB/s        0.8 MB
> > >> 2       2.67    136 MB/s      113 MB/s        1.0 MB
> > >> 3       2.72    106 MB/s      115 MB/s        1.3 MB
> > >> 4       2.78     86 MB/s      109 MB/s        0.9 MB
> > >> 5       2.83     69 MB/s      109 MB/s        1.4 MB
> > >> 6       2.89     53 MB/s      110 MB/s        1.5 MB
> > >> 7       2.91     40 MB/s      112 MB/s        1.4 MB
> > >> 8       2.92     34 MB/s      110 MB/s        1.8 MB
> > >> 9       2.93     27 MB/s      109 MB/s        1.8 MB
> > >> 10      2.94     22 MB/s      109 MB/s        1.8 MB
> > >> 11      2.95     17 MB/s      114 MB/s        1.8 MB
> > >> 12      2.95     13 MB/s      113 MB/s        1.8 MB
> > >> 13      2.95     10 MB/s      111 MB/s        2.3 MB
> > >> 14      2.99      7 MB/s      110 MB/s        2.6 MB
> > >> 15      3.03      6 MB/s      110 MB/s        2.6 MB
> > >>
> > >> [0] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
> > >>
> > >> Signed-off-by: Jennifer Liu 
> > >> Signed-off-by: Nick Terrell 
> > >> Reviewed-by: Omar Sandoval 
> > >> ---
> > >> v1 -> v2:
> > >> - Don't reflow the unchanged line.
> > >>
> > >> fs/btrfs/compression.c | 169 +
> > >> fs/btrfs/compression.h |  18 +++--
> > >> fs/btrfs/lzo.c |   5 +-
> > >> fs/btrfs/super.c   |   7 +-
> > >> fs/btrfs/zlib.c|  33 
> > >> fs/btrfs/zstd.c|  74 +-
> > >> 6 files changed, 202 insertions(+), 104 deletions(-)
> > >>
> > >> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
> > >> index 2955a4ea2fa8..b46652cb653e 100644
> > >> --- a/fs/btrfs/compression.c
> > >> +++ b/fs/btrfs/compression.c
> > >> @@ -822,9 +822,12 @@ void __init btrfs_init_compress(void)
> > >>
> > >>  /*
> > >>   * Preallocate one workspace for each compression type so
> > >> - * we can guarantee forward progress in the worst case
> > >> + * we can guarantee forward progress in the worst case.
> > >> + * Provide the maximum compression level to guarantee large
> > >> + * enough workspace.
> > >>   */
> > >> -workspace = btrfs_compress_op[i]->alloc_workspace();
> > >> +workspace = btrfs_compress_op[i]->alloc_workspace(
> > >> +btrfs_compress_op[i]->max_level);
> >
> > We provide the max level here, so we have at least one workspace per
> > compression type that is large enough.
> >
> > >> 

Re: [PATCH v2] btrfs: add zstd compression level support

2018-11-15 Thread Nick Terrell


> On Nov 13, 2018, at 5:29 AM, Timofey Titovets  wrote:
> 
> Tue, Nov 13, 2018 at 04:52, Nick Terrell :
>> 
>> 
>> 
>>> On Nov 12, 2018, at 4:33 PM, David Sterba  wrote:
>>> 
>>> On Wed, Oct 31, 2018 at 11:11:08AM -0700, Nick Terrell wrote:
>>>> From: Jennifer Liu 
>>>> 
>>>> Adds zstd compression level support to btrfs. Zstd requires
>>>> different amounts of memory for each level, so the design had
>>>> to be modified to allow set_level() to allocate memory. We
>>>> preallocate one workspace of the maximum size to guarantee
>>>> forward progress. This feature is expected to be useful for
>>>> read-mostly filesystems, or when creating images.
>>>> 
>>>> Benchmarks run in qemu on Intel x86 with a single core.
>>>> The benchmark measures the time to copy the Silesia corpus [0] to
>>>> a btrfs filesystem 10 times, then read it back.
>>>> 
>>>> The two important things to note are:
>>>> - The decompression speed and memory remains constant.
>>>> The memory required to decompress is the same as level 1.
>>>> - The compression speed and ratio will vary based on the source.
>>>> 
>>>> Level   Ratio   Compression   Decompression   Compression Memory
>>>> 1       2.59    153 MB/s      112 MB/s        0.8 MB
>>>> 2       2.67    136 MB/s      113 MB/s        1.0 MB
>>>> 3       2.72    106 MB/s      115 MB/s        1.3 MB
>>>> 4       2.78     86 MB/s      109 MB/s        0.9 MB
>>>> 5       2.83     69 MB/s      109 MB/s        1.4 MB
>>>> 6       2.89     53 MB/s      110 MB/s        1.5 MB
>>>> 7       2.91     40 MB/s      112 MB/s        1.4 MB
>>>> 8       2.92     34 MB/s      110 MB/s        1.8 MB
>>>> 9       2.93     27 MB/s      109 MB/s        1.8 MB
>>>> 10      2.94     22 MB/s      109 MB/s        1.8 MB
>>>> 11      2.95     17 MB/s      114 MB/s        1.8 MB
>>>> 12      2.95     13 MB/s      113 MB/s        1.8 MB
>>>> 13      2.95     10 MB/s      111 MB/s        2.3 MB
>>>> 14      2.99      7 MB/s      110 MB/s        2.6 MB
>>>> 15      3.03      6 MB/s      110 MB/s        2.6 MB
>>>> 
>>>> [0] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
>>>> 
>>>> Signed-off-by: Jennifer Liu 
>>>> Signed-off-by: Nick Terrell 
>>>> Reviewed-by: Omar Sandoval 
>>>> ---
>>>> v1 -> v2:
>>>> - Don't reflow the unchanged line.
>>>> 
>>>> fs/btrfs/compression.c | 169 +
>>>> fs/btrfs/compression.h |  18 +++--
>>>> fs/btrfs/lzo.c |   5 +-
>>>> fs/btrfs/super.c   |   7 +-
>>>> fs/btrfs/zlib.c|  33 
>>>> fs/btrfs/zstd.c|  74 +-
>>>> 6 files changed, 202 insertions(+), 104 deletions(-)
>>>> 
>>>> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
>>>> index 2955a4ea2fa8..b46652cb653e 100644
>>>> --- a/fs/btrfs/compression.c
>>>> +++ b/fs/btrfs/compression.c
>>>> @@ -822,9 +822,12 @@ void __init btrfs_init_compress(void)
>>>> 
>>>> /*
>>>>  * Preallocate one workspace for each compression type so
>>>> - * we can guarantee forward progress in the worst case
>>>> + * we can guarantee forward progress in the worst case.
>>>> + * Provide the maximum compression level to guarantee large
>>>> + * enough workspace.
>>>>  */
>>>> -workspace = btrfs_compress_op[i]->alloc_workspace();
>>>> +workspace = btrfs_compress_op[i]->alloc_workspace(
>>>> +btrfs_compress_op[i]->max_level);
>> 
>> We provide the max level here, so we have at least one workspace per
>> compression type that is large enough.
>> 
>>>> if (IS_ERR(workspace)) {
>>>> pr_warn("BTRFS: cannot preallocate compression workspace, will try later\n");

Re: [PATCH v2] btrfs: add zstd compression level support

2018-11-13 Thread Timofey Titovets
Tue, Nov 13, 2018 at 04:52, Nick Terrell :
>
>
>
> > On Nov 12, 2018, at 4:33 PM, David Sterba  wrote:
> >
> > On Wed, Oct 31, 2018 at 11:11:08AM -0700, Nick Terrell wrote:
> >> From: Jennifer Liu 
> >>
> >> Adds zstd compression level support to btrfs. Zstd requires
> >> different amounts of memory for each level, so the design had
> >> to be modified to allow set_level() to allocate memory. We
> >> preallocate one workspace of the maximum size to guarantee
> >> forward progress. This feature is expected to be useful for
> >> read-mostly filesystems, or when creating images.
> >>
> >> Benchmarks run in qemu on Intel x86 with a single core.
> >> The benchmark measures the time to copy the Silesia corpus [0] to
> >> a btrfs filesystem 10 times, then read it back.
> >>
> >> The two important things to note are:
> >> - The decompression speed and memory remains constant.
> >>  The memory required to decompress is the same as level 1.
> >> - The compression speed and ratio will vary based on the source.
> >>
> >> Level   Ratio   Compression   Decompression   Compression Memory
> >> 1       2.59    153 MB/s      112 MB/s        0.8 MB
> >> 2       2.67    136 MB/s      113 MB/s        1.0 MB
> >> 3       2.72    106 MB/s      115 MB/s        1.3 MB
> >> 4       2.78     86 MB/s      109 MB/s        0.9 MB
> >> 5       2.83     69 MB/s      109 MB/s        1.4 MB
> >> 6       2.89     53 MB/s      110 MB/s        1.5 MB
> >> 7       2.91     40 MB/s      112 MB/s        1.4 MB
> >> 8       2.92     34 MB/s      110 MB/s        1.8 MB
> >> 9       2.93     27 MB/s      109 MB/s        1.8 MB
> >> 10      2.94     22 MB/s      109 MB/s        1.8 MB
> >> 11      2.95     17 MB/s      114 MB/s        1.8 MB
> >> 12      2.95     13 MB/s      113 MB/s        1.8 MB
> >> 13      2.95     10 MB/s      111 MB/s        2.3 MB
> >> 14      2.99      7 MB/s      110 MB/s        2.6 MB
> >> 15      3.03      6 MB/s      110 MB/s        2.6 MB
> >>
> >> [0] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
> >>
> >> Signed-off-by: Jennifer Liu 
> >> Signed-off-by: Nick Terrell 
> >> Reviewed-by: Omar Sandoval 
> >> ---
> >> v1 -> v2:
> >> - Don't reflow the unchanged line.
> >>
> >> fs/btrfs/compression.c | 169 +
> >> fs/btrfs/compression.h |  18 +++--
> >> fs/btrfs/lzo.c |   5 +-
> >> fs/btrfs/super.c   |   7 +-
> >> fs/btrfs/zlib.c|  33 
> >> fs/btrfs/zstd.c|  74 +-
> >> 6 files changed, 202 insertions(+), 104 deletions(-)
> >>
> >> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
> >> index 2955a4ea2fa8..b46652cb653e 100644
> >> --- a/fs/btrfs/compression.c
> >> +++ b/fs/btrfs/compression.c
> >> @@ -822,9 +822,12 @@ void __init btrfs_init_compress(void)
> >>
> >>  /*
> >>   * Preallocate one workspace for each compression type so
> >> - * we can guarantee forward progress in the worst case
> >> + * we can guarantee forward progress in the worst case.
> >> + * Provide the maximum compression level to guarantee large
> >> + * enough workspace.
> >>   */
> >> -workspace = btrfs_compress_op[i]->alloc_workspace();
> >> +workspace = btrfs_compress_op[i]->alloc_workspace(
> >> +btrfs_compress_op[i]->max_level);
>
> We provide the max level here, so we have at least one workspace per
> compression type that is large enough.
>
> >>  if (IS_ERR(workspace)) {
> >>  pr_warn("BTRFS: cannot preallocate compression 
> >> workspace, will try later\n");
> >>  } else {
> >> @@ -835,23 +838,78 @@ void __init btrfs_init_compress(void)
> >>  }
> >> }
> >>
> >> +/*
> >> + * put a workspace struct back on the list or free it if we have enough
> >> + * idle ones sitting around
> >> + */

Re: [PATCH v2] btrfs: add zstd compression level support

2018-11-12 Thread Nick Terrell



> On Nov 12, 2018, at 4:33 PM, David Sterba  wrote:
> 
> On Wed, Oct 31, 2018 at 11:11:08AM -0700, Nick Terrell wrote:
>> From: Jennifer Liu 
>> 
>> Adds zstd compression level support to btrfs. Zstd requires
>> different amounts of memory for each level, so the design had
>> to be modified to allow set_level() to allocate memory. We
>> preallocate one workspace of the maximum size to guarantee
>> forward progress. This feature is expected to be useful for
>> read-mostly filesystems, or when creating images.
>> 
>> Benchmarks run in qemu on Intel x86 with a single core.
>> The benchmark measures the time to copy the Silesia corpus [0] to
>> a btrfs filesystem 10 times, then read it back.
>> 
>> The two important things to note are:
>> - The decompression speed and memory remains constant.
>>  The memory required to decompress is the same as level 1.
>> - The compression speed and ratio will vary based on the source.
>> 
>> Level   Ratio   Compression   Decompression   Compression Memory
>> 1       2.59    153 MB/s      112 MB/s        0.8 MB
>> 2       2.67    136 MB/s      113 MB/s        1.0 MB
>> 3       2.72    106 MB/s      115 MB/s        1.3 MB
>> 4       2.78     86 MB/s      109 MB/s        0.9 MB
>> 5       2.83     69 MB/s      109 MB/s        1.4 MB
>> 6       2.89     53 MB/s      110 MB/s        1.5 MB
>> 7       2.91     40 MB/s      112 MB/s        1.4 MB
>> 8       2.92     34 MB/s      110 MB/s        1.8 MB
>> 9       2.93     27 MB/s      109 MB/s        1.8 MB
>> 10      2.94     22 MB/s      109 MB/s        1.8 MB
>> 11      2.95     17 MB/s      114 MB/s        1.8 MB
>> 12      2.95     13 MB/s      113 MB/s        1.8 MB
>> 13      2.95     10 MB/s      111 MB/s        2.3 MB
>> 14      2.99      7 MB/s      110 MB/s        2.6 MB
>> 15      3.03      6 MB/s      110 MB/s        2.6 MB
>> 
>> [0] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
>> 
>> Signed-off-by: Jennifer Liu 
>> Signed-off-by: Nick Terrell 
>> Reviewed-by: Omar Sandoval 
>> ---
>> v1 -> v2:
>> - Don't reflow the unchanged line.
>> 
>> fs/btrfs/compression.c | 169 +
>> fs/btrfs/compression.h |  18 +++--
>> fs/btrfs/lzo.c |   5 +-
>> fs/btrfs/super.c   |   7 +-
>> fs/btrfs/zlib.c|  33 
>> fs/btrfs/zstd.c|  74 +-
>> 6 files changed, 202 insertions(+), 104 deletions(-)
>> 
>> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
>> index 2955a4ea2fa8..b46652cb653e 100644
>> --- a/fs/btrfs/compression.c
>> +++ b/fs/btrfs/compression.c
>> @@ -822,9 +822,12 @@ void __init btrfs_init_compress(void)
>> 
>>  /*
>>   * Preallocate one workspace for each compression type so
>> - * we can guarantee forward progress in the worst case
>> + * we can guarantee forward progress in the worst case.
>> + * Provide the maximum compression level to guarantee large
>> + * enough workspace.
>>   */
>> -workspace = btrfs_compress_op[i]->alloc_workspace();
>> +workspace = btrfs_compress_op[i]->alloc_workspace(
>> +btrfs_compress_op[i]->max_level);

We provide the max level here, so we have at least one workspace per
compression type that is large enough.

>>  if (IS_ERR(workspace)) {
>>  pr_warn("BTRFS: cannot preallocate compression 
>> workspace, will try later\n");
>>  } else {
>> @@ -835,23 +838,78 @@ void __init btrfs_init_compress(void)
>>  }
>> }
>> 
>> +/*
>> + * put a workspace struct back on the list or free it if we have enough
>> + * idle ones sitting around
>> + */
>> +static void __free_workspace(int type, struct list_head *workspace,
>> + bool heuristic)
>> +{
>> +int idx = type - 1;
>> +struct list_head *idle_ws;
>> +spinlock_t *ws_lock;
>> +atomic_t *total_ws;
>> +wait_queue_head_t *ws_wait;
>> +int *free_ws;
>> +
>> +if (heuristic) {
>> +idle_ws  = &btrfs_heuristic_ws.idle_ws;
>> + 

Re: [PATCH v2] btrfs: add zstd compression level support

2018-11-12 Thread David Sterba
On Wed, Oct 31, 2018 at 11:11:08AM -0700, Nick Terrell wrote:
> From: Jennifer Liu 
> 
> Adds zstd compression level support to btrfs. Zstd requires
> different amounts of memory for each level, so the design had
> to be modified to allow set_level() to allocate memory. We
> preallocate one workspace of the maximum size to guarantee
> forward progress. This feature is expected to be useful for
> read-mostly filesystems, or when creating images.
> 
> Benchmarks run in qemu on Intel x86 with a single core.
> The benchmark measures the time to copy the Silesia corpus [0] to
> a btrfs filesystem 10 times, then read it back.
> 
> The two important things to note are:
> - The decompression speed and memory remains constant.
>   The memory required to decompress is the same as level 1.
> - The compression speed and ratio will vary based on the source.
> 
> Level   Ratio   Compression   Decompression   Compression Memory
> 1       2.59    153 MB/s      112 MB/s        0.8 MB
> 2       2.67    136 MB/s      113 MB/s        1.0 MB
> 3       2.72    106 MB/s      115 MB/s        1.3 MB
> 4       2.78     86 MB/s      109 MB/s        0.9 MB
> 5       2.83     69 MB/s      109 MB/s        1.4 MB
> 6       2.89     53 MB/s      110 MB/s        1.5 MB
> 7       2.91     40 MB/s      112 MB/s        1.4 MB
> 8       2.92     34 MB/s      110 MB/s        1.8 MB
> 9       2.93     27 MB/s      109 MB/s        1.8 MB
> 10      2.94     22 MB/s      109 MB/s        1.8 MB
> 11      2.95     17 MB/s      114 MB/s        1.8 MB
> 12      2.95     13 MB/s      113 MB/s        1.8 MB
> 13      2.95     10 MB/s      111 MB/s        2.3 MB
> 14      2.99      7 MB/s      110 MB/s        2.6 MB
> 15      3.03      6 MB/s      110 MB/s        2.6 MB
> 
> [0] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
> 
> Signed-off-by: Jennifer Liu 
> Signed-off-by: Nick Terrell 
> Reviewed-by: Omar Sandoval 
> ---
> v1 -> v2:
> - Don't reflow the unchanged line.
> 
>  fs/btrfs/compression.c | 169 +
>  fs/btrfs/compression.h |  18 +++--
>  fs/btrfs/lzo.c |   5 +-
>  fs/btrfs/super.c   |   7 +-
>  fs/btrfs/zlib.c|  33 
>  fs/btrfs/zstd.c|  74 +-
>  6 files changed, 202 insertions(+), 104 deletions(-)
> 
> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
> index 2955a4ea2fa8..b46652cb653e 100644
> --- a/fs/btrfs/compression.c
> +++ b/fs/btrfs/compression.c
> @@ -822,9 +822,12 @@ void __init btrfs_init_compress(void)
> 
>   /*
>* Preallocate one workspace for each compression type so
> -  * we can guarantee forward progress in the worst case
> +  * we can guarantee forward progress in the worst case.
> +  * Provide the maximum compression level to guarantee large
> +  * enough workspace.
>*/
> - workspace = btrfs_compress_op[i]->alloc_workspace();
> + workspace = btrfs_compress_op[i]->alloc_workspace(
> + btrfs_compress_op[i]->max_level);
>   if (IS_ERR(workspace)) {
>   pr_warn("BTRFS: cannot preallocate compression 
> workspace, will try later\n");
>   } else {
> @@ -835,23 +838,78 @@ void __init btrfs_init_compress(void)
>   }
>  }
> 
> +/*
> + * put a workspace struct back on the list or free it if we have enough
> + * idle ones sitting around
> + */
> +static void __free_workspace(int type, struct list_head *workspace,
> +  bool heuristic)
> +{
> + int idx = type - 1;
> + struct list_head *idle_ws;
> + spinlock_t *ws_lock;
> + atomic_t *total_ws;
> + wait_queue_head_t *ws_wait;
> + int *free_ws;
> +
> + if (heuristic) {
> + idle_ws  = &btrfs_heuristic_ws.idle_ws;
> + ws_lock  = &btrfs_heuristic_ws.ws_lock;
> + total_ws = &btrfs_heuristic_ws.total_ws;
> + ws_wait  = &btrfs_heuristic_ws.ws_wait;
> + free_ws  = &btrfs_heuristic_ws.free_ws;
> + } else {
> + idle_ws  = &btrfs_comp_ws[idx].idle_ws;
> + ws_lock  = &btrfs_comp_ws[idx].ws_lock;
> + total_ws = &btrfs_comp_ws[idx].total_ws;
> + ws_wait  = &btrfs_comp_ws[idx].ws_wait;
> + free_ws  = &btrfs_comp_ws[idx].free_ws;
> + }
> +
> + spin_lock(ws_lock);
> + if (*free_ws <= num_online_cpus()) {
> + list_add(workspace, idle_ws);
> + (*free_ws)++;
> + spin_unlock(ws_lock);
> + goto wake;
> + }
> + spin_unlock(ws_lock);
> +
> + if (heuristic)
> +   free_heuristic_ws(workspace);

Re: [PATCH v2] btrfs: add zstd compression level support

2018-11-12 Thread David Sterba
On Wed, Oct 31, 2018 at 11:11:08AM -0700, Nick Terrell wrote:
> From: Jennifer Liu 
> 
> Adds zstd compression level support to btrfs. Zstd requires
> different amounts of memory for each level, so the design had
> to be modified to allow set_level() to allocate memory. We
> preallocate one workspace of the maximum size to guarantee
> forward progress. This feature is expected to be useful for
> read-mostly filesystems, or when creating images.
> 
> Benchmarks run in qemu on Intel x86 with a single core.
> The benchmark measures the time to copy the Silesia corpus [0] to
> a btrfs filesystem 10 times, then read it back.
> 
> The two important things to note are:
> - The decompression speed and memory remains constant.
>   The memory required to decompress is the same as level 1.
> - The compression speed and ratio will vary based on the source.
> 
> Level   Ratio   Compression   Decompression   Compression Memory
> 1       2.59    153 MB/s      112 MB/s        0.8 MB
> 2       2.67    136 MB/s      113 MB/s        1.0 MB
> 3       2.72    106 MB/s      115 MB/s        1.3 MB
> 4       2.78     86 MB/s      109 MB/s        0.9 MB
> 5       2.83     69 MB/s      109 MB/s        1.4 MB
> 6       2.89     53 MB/s      110 MB/s        1.5 MB
> 7       2.91     40 MB/s      112 MB/s        1.4 MB
> 8       2.92     34 MB/s      110 MB/s        1.8 MB
> 9       2.93     27 MB/s      109 MB/s        1.8 MB
> 10      2.94     22 MB/s      109 MB/s        1.8 MB
> 11      2.95     17 MB/s      114 MB/s        1.8 MB
> 12      2.95     13 MB/s      113 MB/s        1.8 MB
> 13      2.95     10 MB/s      111 MB/s        2.3 MB
> 14      2.99      7 MB/s      110 MB/s        2.6 MB
> 15      3.03      6 MB/s      110 MB/s        2.6 MB
> 
> [0] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
> 
> Signed-off-by: Jennifer Liu 
> Signed-off-by: Nick Terrell 
> Reviewed-by: Omar Sandoval 

Please split the patch to a series where each patch does one logical
thing.


Re: [PATCH v2] btrfs: add zstd compression level support

2018-11-01 Thread Nick Terrell


> On Nov 1, 2018, at 5:57 AM, Timofey Titovets  wrote:
> 
> Wed, Oct 31, 2018 at 21:12, Nick Terrell :
>> 
>> From: Jennifer Liu 
>> 
>> Adds zstd compression level support to btrfs. Zstd requires
>> different amounts of memory for each level, so the design had
>> to be modified to allow set_level() to allocate memory. We
>> preallocate one workspace of the maximum size to guarantee
>> forward progress. This feature is expected to be useful for
>> read-mostly filesystems, or when creating images.
>> 
>> Benchmarks run in qemu on Intel x86 with a single core.
>> The benchmark measures the time to copy the Silesia corpus [0] to
>> a btrfs filesystem 10 times, then read it back.
>> 
>> The two important things to note are:
>> - The decompression speed and memory remains constant.
>>  The memory required to decompress is the same as level 1.
>> - The compression speed and ratio will vary based on the source.
>> 
>> Level   Ratio   Compression   Decompression   Compression Memory
>> 1       2.59    153 MB/s      112 MB/s        0.8 MB
>> 2       2.67    136 MB/s      113 MB/s        1.0 MB
>> 3       2.72    106 MB/s      115 MB/s        1.3 MB
>> 4       2.78     86 MB/s      109 MB/s        0.9 MB
>> 5       2.83     69 MB/s      109 MB/s        1.4 MB
>> 6       2.89     53 MB/s      110 MB/s        1.5 MB
>> 7       2.91     40 MB/s      112 MB/s        1.4 MB
>> 8       2.92     34 MB/s      110 MB/s        1.8 MB
>> 9       2.93     27 MB/s      109 MB/s        1.8 MB
>> 10      2.94     22 MB/s      109 MB/s        1.8 MB
>> 11      2.95     17 MB/s      114 MB/s        1.8 MB
>> 12      2.95     13 MB/s      113 MB/s        1.8 MB
>> 13      2.95     10 MB/s      111 MB/s        2.3 MB
>> 14      2.99      7 MB/s      110 MB/s        2.6 MB
>> 15      3.03      6 MB/s      110 MB/s        2.6 MB
>> 
>> [0] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
>> 
>> Signed-off-by: Jennifer Liu 
>> Signed-off-by: Nick Terrell 
>> Reviewed-by: Omar Sandoval 



> 
> You didn't mention, so:
> Did you test compression ratio/performance with compress-force or just 
> compress?

I tested with compress-force, since I just reused my script from before [0].

I reran some levels with compress and got these numbers:

Level   Ratio   Compression   Decompression
1       2.21    158 MB/s      113 MB/s
3       2.28    117 MB/s      113 MB/s
5       2.32     81 MB/s      112 MB/s
7       2.37     47 MB/s      116 MB/s
15      2.41      7 MB/s      115 MB/s

Using compress probably makes sense with lower levels, to get higher
write speeds, but if you're using higher compression levels, you'll
likely want to use compress-force, since you likely don't care too much
about write speeds. Obviously this will depend on the data you're
compressing.

[0] https://gist.github.com/terrelln/51ed3c9da6f94d613c01fcdae60567e8

-Nick

> 
> Thanks.
> 
> -- 
> Have a nice day,
> Timofey.



Re: [PATCH v2] btrfs: add zstd compression level support

2018-11-01 Thread Timofey Titovets
Wed, Oct 31, 2018 at 21:12, Nick Terrell :
>
> From: Jennifer Liu 
>
> Adds zstd compression level support to btrfs. Zstd requires
> different amounts of memory for each level, so the design had
> to be modified to allow set_level() to allocate memory. We
> preallocate one workspace of the maximum size to guarantee
> forward progress. This feature is expected to be useful for
> read-mostly filesystems, or when creating images.
>
> Benchmarks run in qemu on Intel x86 with a single core.
> The benchmark measures the time to copy the Silesia corpus [0] to
> a btrfs filesystem 10 times, then read it back.
>
> The two important things to note are:
> - The decompression speed and memory remains constant.
>   The memory required to decompress is the same as level 1.
> - The compression speed and ratio will vary based on the source.
>
> Level   Ratio   Compression   Decompression   Compression Memory
> 1       2.59    153 MB/s      112 MB/s        0.8 MB
> 2       2.67    136 MB/s      113 MB/s        1.0 MB
> 3       2.72    106 MB/s      115 MB/s        1.3 MB
> 4       2.78     86 MB/s      109 MB/s        0.9 MB
> 5       2.83     69 MB/s      109 MB/s        1.4 MB
> 6       2.89     53 MB/s      110 MB/s        1.5 MB
> 7       2.91     40 MB/s      112 MB/s        1.4 MB
> 8       2.92     34 MB/s      110 MB/s        1.8 MB
> 9       2.93     27 MB/s      109 MB/s        1.8 MB
> 10      2.94     22 MB/s      109 MB/s        1.8 MB
> 11      2.95     17 MB/s      114 MB/s        1.8 MB
> 12      2.95     13 MB/s      113 MB/s        1.8 MB
> 13      2.95     10 MB/s      111 MB/s        2.3 MB
> 14      2.99      7 MB/s      110 MB/s        2.6 MB
> 15      3.03      6 MB/s      110 MB/s        2.6 MB
>
> [0] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
>
> Signed-off-by: Jennifer Liu 
> Signed-off-by: Nick Terrell 
> Reviewed-by: Omar Sandoval 
> ---
> v1 -> v2:
> - Don't reflow the unchanged line.
>
>  fs/btrfs/compression.c | 169 +
>  fs/btrfs/compression.h |  18 +++--
>  fs/btrfs/lzo.c |   5 +-
>  fs/btrfs/super.c   |   7 +-
>  fs/btrfs/zlib.c|  33 
>  fs/btrfs/zstd.c|  74 +-
>  6 files changed, 202 insertions(+), 104 deletions(-)
>
> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
> index 2955a4ea2fa8..b46652cb653e 100644
> --- a/fs/btrfs/compression.c
> +++ b/fs/btrfs/compression.c
> @@ -822,9 +822,12 @@ void __init btrfs_init_compress(void)
>
> /*
>  * Preallocate one workspace for each compression type so
> -* we can guarantee forward progress in the worst case
> +* we can guarantee forward progress in the worst case.
> +* Provide the maximum compression level to guarantee large
> +* enough workspace.
>  */
> -   workspace = btrfs_compress_op[i]->alloc_workspace();
> +   workspace = btrfs_compress_op[i]->alloc_workspace(
> +   btrfs_compress_op[i]->max_level);
> if (IS_ERR(workspace)) {
> pr_warn("BTRFS: cannot preallocate compression 
> workspace, will try later\n");
> } else {
> @@ -835,23 +838,78 @@ void __init btrfs_init_compress(void)
> }
>  }
>
> +/*
> + * put a workspace struct back on the list or free it if we have enough
> + * idle ones sitting around
> + */
> +static void __free_workspace(int type, struct list_head *workspace,
> +bool heuristic)
> +{
> +   int idx = type - 1;
> +   struct list_head *idle_ws;
> +   spinlock_t *ws_lock;
> +   atomic_t *total_ws;
> +   wait_queue_head_t *ws_wait;
> +   int *free_ws;
> +
> +   if (heuristic) {
> +   idle_ws  = _heuristic_ws.idle_ws;
> +   ws_lock  = _heuristic_ws.ws_lock;
> +   total_ws = _heuristic_ws.total_ws;
> +   ws_wait  = _heuristic_ws.ws_wait;
> +   free_ws  = _heuristic_ws.free_ws;
> +   } else {
> +   idle_ws  = _comp_ws[idx].idle_ws;
> +   ws_lock  = _comp_ws[idx].ws_lock;
> +   total_ws = _comp_ws[idx].total_ws;
> +   ws_wait  = _comp_ws[idx].ws_wait;
> +   free_ws  = _comp_ws[idx].free_ws;
> +   }
> +
> +   spin_lock(ws_lock);
> +   if (*free_ws <= num_online_cpus()) {
> +   list_add(workspace, idle_ws);
> +   (*free_ws)++;
> +   spin_unlock(ws_lock);
> +   g

[PATCH v2] btrfs: add zstd compression level support

2018-10-31 Thread Nick Terrell
From: Jennifer Liu 

Adds zstd compression level support to btrfs. Zstd requires
different amounts of memory for each level, so the design had
to be modified to allow set_level() to allocate memory. We
preallocate one workspace of the maximum size to guarantee
forward progress. This feature is expected to be useful for
read-mostly filesystems, or when creating images.

Benchmarks run in qemu on Intel x86 with a single core.
The benchmark measures the time to copy the Silesia corpus [0] to
a btrfs filesystem 10 times, then read it back.

The two important things to note are:
- The decompression speed and memory remains constant.
  The memory required to decompress is the same as level 1.
- The compression speed and ratio will vary based on the source.

Level   Ratio   Compression   Decompression   Compression Memory
1       2.59    153 MB/s      112 MB/s        0.8 MB
2       2.67    136 MB/s      113 MB/s        1.0 MB
3       2.72    106 MB/s      115 MB/s        1.3 MB
4       2.78     86 MB/s      109 MB/s        0.9 MB
5       2.83     69 MB/s      109 MB/s        1.4 MB
6       2.89     53 MB/s      110 MB/s        1.5 MB
7       2.91     40 MB/s      112 MB/s        1.4 MB
8       2.92     34 MB/s      110 MB/s        1.8 MB
9       2.93     27 MB/s      109 MB/s        1.8 MB
10      2.94     22 MB/s      109 MB/s        1.8 MB
11      2.95     17 MB/s      114 MB/s        1.8 MB
12      2.95     13 MB/s      113 MB/s        1.8 MB
13      2.95     10 MB/s      111 MB/s        2.3 MB
14      2.99      7 MB/s      110 MB/s        2.6 MB
15      3.03      6 MB/s      110 MB/s        2.6 MB

[0] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
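
(A rough sense of scale, an illustrative computation that is not part of
the patch: Silesia is about 212 MB uncompressed, so level 1 at ratio
2.59 stores roughly 212 / 2.59 ≈ 82 MB, while level 15 at ratio 3.03
stores roughly 212 / 3.03 ≈ 70 MB; about 15% less space in exchange for
compression that is roughly 25x slower, 153 MB/s vs 6 MB/s.)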

Signed-off-by: Jennifer Liu 
Signed-off-by: Nick Terrell 
Reviewed-by: Omar Sandoval 
---
v1 -> v2:
- Don't reflow the unchanged line.

 fs/btrfs/compression.c | 169 +
 fs/btrfs/compression.h |  18 +++--
 fs/btrfs/lzo.c |   5 +-
 fs/btrfs/super.c   |   7 +-
 fs/btrfs/zlib.c|  33 
 fs/btrfs/zstd.c|  74 +-
 6 files changed, 202 insertions(+), 104 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 2955a4ea2fa8..b46652cb653e 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -822,9 +822,12 @@ void __init btrfs_init_compress(void)

/*
 * Preallocate one workspace for each compression type so
-* we can guarantee forward progress in the worst case
+* we can guarantee forward progress in the worst case.
+* Provide the maximum compression level to guarantee large
+* enough workspace.
 */
-   workspace = btrfs_compress_op[i]->alloc_workspace();
+   workspace = btrfs_compress_op[i]->alloc_workspace(
+   btrfs_compress_op[i]->max_level);
if (IS_ERR(workspace)) {
pr_warn("BTRFS: cannot preallocate compression 
workspace, will try later\n");
} else {
@@ -835,23 +838,78 @@ void __init btrfs_init_compress(void)
}
 }

+/*
+ * put a workspace struct back on the list or free it if we have enough
+ * idle ones sitting around
+ */
+static void __free_workspace(int type, struct list_head *workspace,
+bool heuristic)
+{
+   int idx = type - 1;
+   struct list_head *idle_ws;
+   spinlock_t *ws_lock;
+   atomic_t *total_ws;
+   wait_queue_head_t *ws_wait;
+   int *free_ws;
+
+   if (heuristic) {
+   idle_ws  = &btrfs_heuristic_ws.idle_ws;
+   ws_lock  = &btrfs_heuristic_ws.ws_lock;
+   total_ws = &btrfs_heuristic_ws.total_ws;
+   ws_wait  = &btrfs_heuristic_ws.ws_wait;
+   free_ws  = &btrfs_heuristic_ws.free_ws;
+   } else {
+   idle_ws  = &btrfs_comp_ws[idx].idle_ws;
+   ws_lock  = &btrfs_comp_ws[idx].ws_lock;
+   total_ws = &btrfs_comp_ws[idx].total_ws;
+   ws_wait  = &btrfs_comp_ws[idx].ws_wait;
+   free_ws  = &btrfs_comp_ws[idx].free_ws;
+   }
+
+   spin_lock(ws_lock);
+   if (*free_ws <= num_online_cpus()) {
+   list_add(workspace, idle_ws);
+   (*free_ws)++;
+   spin_unlock(ws_lock);
+   goto wake;
+   }
+   spin_unlock(ws_lock);
+
+   if (heuristic)
+   free_heuristic_ws(workspace);
+   else
+   btrfs_compress_op[idx]->free_workspace(workspace);
+   atomic_dec(total_ws);
+wake:
+   cond_wake_up(ws_wait);
+}
+
+static void free_workspace(int type, struct list_head *ws)
+{
+   return __free_workspace(type, ws, false);
+}
+
 /*
  * This finds an available workspace or allocates a new one.
  * If it's not possible to allocate a new one, waits until there's one.
 * Preallocation makes a forward progress guarantees and we do not return
 * errors, as we return the workspace.

Re: [PATCH] btrfs: add zstd compression level support

2018-10-30 Thread Omar Sandoval
On Tue, Oct 30, 2018 at 12:06:21PM -0700, Nick Terrell wrote:
> From: Jennifer Liu 
> 
> Adds zstd compression level support to btrfs. Zstd requires
> different amounts of memory for each level, so the design had
> to be modified to allow set_level() to allocate memory. We
> preallocate one workspace of the maximum size to guarantee
> forward progress. This feature is expected to be useful for
> read-mostly filesystems, or when creating images.
> 
> Benchmarks run in qemu on Intel x86 with a single core.
> The benchmark measures the time to copy the Silesia corpus [0] to
> a btrfs filesystem 10 times, then read it back.
> 
> The two important things to note are:
> - The decompression speed and memory remains constant.
>   The memory required to decompress is the same as level 1.
> - The compression speed and ratio will vary based on the source.
> 
> Level   Ratio   Compression   Decompression   Compression Memory
> 1       2.59    153 MB/s      112 MB/s        0.8 MB
> 2       2.67    136 MB/s      113 MB/s        1.0 MB
> 3       2.72    106 MB/s      115 MB/s        1.3 MB
> 4       2.78     86 MB/s      109 MB/s        0.9 MB
> 5       2.83     69 MB/s      109 MB/s        1.4 MB
> 6       2.89     53 MB/s      110 MB/s        1.5 MB
> 7       2.91     40 MB/s      112 MB/s        1.4 MB
> 8       2.92     34 MB/s      110 MB/s        1.8 MB
> 9       2.93     27 MB/s      109 MB/s        1.8 MB
> 10      2.94     22 MB/s      109 MB/s        1.8 MB
> 11      2.95     17 MB/s      114 MB/s        1.8 MB
> 12      2.95     13 MB/s      113 MB/s        1.8 MB
> 13      2.95     10 MB/s      111 MB/s        2.3 MB
> 14      2.99      7 MB/s      110 MB/s        2.6 MB
> 15      3.03      6 MB/s      110 MB/s        2.6 MB
> 
> [0] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia

Reviewed-by: Omar Sandoval 

> Signed-off-by: Jennifer Liu 
> Signed-off-by: Nick Terrell 
> ---
>  fs/btrfs/compression.c | 172 +
>  fs/btrfs/compression.h |  18 +++--
>  fs/btrfs/lzo.c |   5 +-
>  fs/btrfs/super.c   |   7 +-
>  fs/btrfs/zlib.c|  33 
>  fs/btrfs/zstd.c|  74 ++
>  6 files changed, 204 insertions(+), 105 deletions(-)
> 
> diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
> index 2955a4ea2fa8..bd8e69381dc9 100644
> --- a/fs/btrfs/compression.c
> +++ b/fs/btrfs/compression.c
> @@ -822,11 +822,15 @@ void __init btrfs_init_compress(void)
>  
>   /*
>* Preallocate one workspace for each compression type so
> -  * we can guarantee forward progress in the worst case
> +  * we can guarantee forward progress in the worst case.
> +  * Provide the maximum compression level to guarantee large
> +  * enough workspace.
>*/
> - workspace = btrfs_compress_op[i]->alloc_workspace();
> + workspace = btrfs_compress_op[i]->alloc_workspace(
> + btrfs_compress_op[i]->max_level);
>   if (IS_ERR(workspace)) {
> - pr_warn("BTRFS: cannot preallocate compression 
> workspace, will try later\n");
> + pr_warn("BTRFS: cannot preallocate compression "
> + "workspace, will try later\n");

Nit: since you didn't change this line, don't rewrap it.


[PATCH] btrfs: add zstd compression level support

2018-10-30 Thread Nick Terrell
From: Jennifer Liu 

Adds zstd compression level support to btrfs. Zstd requires
different amounts of memory for each level, so the design had
to be modified to allow set_level() to allocate memory. We
preallocate one workspace of the maximum size to guarantee
forward progress. This feature is expected to be useful for
read-mostly filesystems, or when creating images.

Benchmarks run in qemu on Intel x86 with a single core.
The benchmark measures the time to copy the Silesia corpus [0] to
a btrfs filesystem 10 times, then read it back.

The two important things to note are:
- The decompression speed and memory remains constant.
  The memory required to decompress is the same as level 1.
- The compression speed and ratio will vary based on the source.

Level   Ratio   Compression   Decompression   Compression Memory
1       2.59    153 MB/s      112 MB/s        0.8 MB
2       2.67    136 MB/s      113 MB/s        1.0 MB
3       2.72    106 MB/s      115 MB/s        1.3 MB
4       2.78     86 MB/s      109 MB/s        0.9 MB
5       2.83     69 MB/s      109 MB/s        1.4 MB
6       2.89     53 MB/s      110 MB/s        1.5 MB
7       2.91     40 MB/s      112 MB/s        1.4 MB
8       2.92     34 MB/s      110 MB/s        1.8 MB
9       2.93     27 MB/s      109 MB/s        1.8 MB
10      2.94     22 MB/s      109 MB/s        1.8 MB
11      2.95     17 MB/s      114 MB/s        1.8 MB
12      2.95     13 MB/s      113 MB/s        1.8 MB
13      2.95     10 MB/s      111 MB/s        2.3 MB
14      2.99      7 MB/s      110 MB/s        2.6 MB
15      3.03      6 MB/s      110 MB/s        2.6 MB

[0] http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia

Signed-off-by: Jennifer Liu 
Signed-off-by: Nick Terrell 
---
 fs/btrfs/compression.c | 172 +
 fs/btrfs/compression.h |  18 +++--
 fs/btrfs/lzo.c |   5 +-
 fs/btrfs/super.c   |   7 +-
 fs/btrfs/zlib.c|  33 
 fs/btrfs/zstd.c|  74 ++
 6 files changed, 204 insertions(+), 105 deletions(-)

diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 2955a4ea2fa8..bd8e69381dc9 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -822,11 +822,15 @@ void __init btrfs_init_compress(void)
 
/*
 * Preallocate one workspace for each compression type so
-* we can guarantee forward progress in the worst case
+* we can guarantee forward progress in the worst case.
+* Provide the maximum compression level to guarantee large
+* enough workspace.
 */
-   workspace = btrfs_compress_op[i]->alloc_workspace();
+   workspace = btrfs_compress_op[i]->alloc_workspace(
+   btrfs_compress_op[i]->max_level);
if (IS_ERR(workspace)) {
-   pr_warn("BTRFS: cannot preallocate compression 
workspace, will try later\n");
+   pr_warn("BTRFS: cannot preallocate compression "
+   "workspace, will try later\n");
} else {
                atomic_set(&btrfs_comp_ws[i].total_ws, 1);
btrfs_comp_ws[i].free_ws = 1;
@@ -835,23 +839,78 @@ void __init btrfs_init_compress(void)
}
 }
 
+/*
+ * put a workspace struct back on the list or free it if we have enough
+ * idle ones sitting around
+ */
+static void __free_workspace(int type, struct list_head *workspace,
+bool heuristic)
+{
+   int idx = type - 1;
+   struct list_head *idle_ws;
+   spinlock_t *ws_lock;
+   atomic_t *total_ws;
+   wait_queue_head_t *ws_wait;
+   int *free_ws;
+
+   if (heuristic) {
+   idle_ws  = &btrfs_heuristic_ws.idle_ws;
+   ws_lock  = &btrfs_heuristic_ws.ws_lock;
+   total_ws = &btrfs_heuristic_ws.total_ws;
+   ws_wait  = &btrfs_heuristic_ws.ws_wait;
+   free_ws  = &btrfs_heuristic_ws.free_ws;
+   } else {
+   idle_ws  = &btrfs_comp_ws[idx].idle_ws;
+   ws_lock  = &btrfs_comp_ws[idx].ws_lock;
+   total_ws = &btrfs_comp_ws[idx].total_ws;
+   ws_wait  = &btrfs_comp_ws[idx].ws_wait;
+   free_ws  = &btrfs_comp_ws[idx].free_ws;
+   }
+
+   spin_lock(ws_lock);
+   if (*free_ws <= num_online_cpus()) {
+   list_add(workspace, idle_ws);
+   (*free_ws)++;
+   spin_unlock(ws_lock);
+   goto wake;
+   }
+   spin_unlock(ws_lock);
+
+   if (heuristic)
+   free_heuristic_ws(workspace);
+   else
+   btrfs_compress_op[idx]->free_workspace(workspace);
+   atomic_dec(total_ws);
+wake:
+   cond_wake_up(ws_wait);
+}
+
+static void free_workspace(int type, struct list_head *ws)
+{
+   return __free_workspace(type, ws, false);
+}
+
 /*
 * This finds an available workspace or allocates a new one.

Re: [PATCH] btrfs: property: Set incompat flag of lzo/zstd compression

2018-05-15 Thread David Sterba
On Tue, May 15, 2018 at 04:51:26PM +0900, Misono Tomohiro wrote:
> Incompat flag of lzo/zstd compression should be set at:
>  1. mount time (-o compress/compress-force)
>  2. when defrag is done
>  3. when property is set
> 
> Currently 3. is missing and this commit adds this.

That was missed during the review, thanks for catching it.

> Signed-off-by: Tomohiro Misono <misono.tomoh...@jp.fujitsu.com>

Reviewed-by: David Sterba <dste...@suse.com>

For 4.17-rc and for stable 4.14+.


Re: [PATCH] btrfs: property: Set incompat flag of lzo/zstd compression

2018-05-15 Thread Su Yue


On 05/15/2018 04:35 PM, Duncan wrote:
> Su Yue posted on Tue, 15 May 2018 16:05:01 +0800 as excerpted:
> 
> 
>>
>> On 05/15/2018 03:51 PM, Misono Tomohiro wrote:
>>> Incompat flag of lzo/zstd compression should be set at:
>>>  1. mount time (-o compress/compress-force)
>>>  2. when defrag is done 3. when property is set
>>>
>>> Currently 3. is missing and this commit adds this.
>>>
>>>
>> If I don't misunderstand, the compression property of an inode only applies
>> to *the* inode, not the whole filesystem.
>> So the original logic should be okay.
> 
> But the inode is on the filesystem, and if it's compressed with lzo/zstd, 
> the incompat flag should be set to avoid mounting with an earlier kernel 
> that doesn't understand that compression and would therefore, if we're 
> lucky, simply fail to read the data compressed in that file/inode.  (If 
> we're unlucky it could blow up with kernel memory corruption like James 
> Harvey's current case of unexpected, corrupted compressed data in a nocow 
> file that, being nocow, doesn't have csum validation to fail and abort the 
> decompression, and shouldn't be compressed at all.)
> 
> So better to set the incompat flag and refuse to mount at all on kernels 
> that don't have the required compression support.
> 

Get it.
As your conclusion, it's indeed better to set the incompat flag.

Thanks,
Su






Re: [PATCH] btrfs: property: Set incompat flag of lzo/zstd compression

2018-05-15 Thread Duncan
Su Yue posted on Tue, 15 May 2018 16:05:01 +0800 as excerpted:


> 
> On 05/15/2018 03:51 PM, Misono Tomohiro wrote:
>> Incompat flag of lzo/zstd compression should be set at:
>>  1. mount time (-o compress/compress-force)
>>  2. when defrag is done 3. when property is set
>> 
>> Currently 3. is missing and this commit adds this.
>> 
>> 
> If I don't misunderstand, the compression property of an inode only applies
> to *the* inode, not the whole filesystem.
> So the original logic should be okay.

But the inode is on the filesystem, and if it's compressed with lzo/zstd, 
the incompat flag should be set to avoid mounting with an earlier kernel 
that doesn't understand that compression and would therefore, if we're 
lucky, simply fail to read the data compressed in that file/inode.  (If 
we're unlucky it could blow up with kernel memory corruption like James 
Harvey's current case of unexpected, corrupted compressed data in a nocow 
file that, being nocow, doesn't have csum validation to fail and abort the 
decompression, and shouldn't be compressed at all.)

So better to set the incompat flag and refuse to mount at all on kernels 
that don't have the required compression support.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: [PATCH] btrfs: property: Set incompat flag of lzo/zstd compression

2018-05-15 Thread Su Yue


On 05/15/2018 04:05 PM, Su Yue wrote:
> 
> 
> On 05/15/2018 03:51 PM, Misono Tomohiro wrote:
>> Incompat flag of lzo/zstd compression should be set at:
>>  1. mount time (-o compress/compress-force)
>>  2. when defrag is done
>>  3. when property is set
>>
>> Currently 3. is missing and this commit adds this.
>>
> 
> If I don't misunderstand, compression property of an inode is only

Embarrassed by my bad memory of btrfs_set_fs_incompat().
The patch is fine. Just ignore this thread.

> applied to *the* inode, not the whole filesystem.
> So the original logic should be okay.
> 
> Thanks,
> Su
> 
>> Signed-off-by: Tomohiro Misono <misono.tomoh...@jp.fujitsu.com>
>> ---
>>  fs/btrfs/props.c | 12 
>>  1 file changed, 8 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/btrfs/props.c b/fs/btrfs/props.c
>> index 53a8c95828e3..dc6140013ae8 100644
>> --- a/fs/btrfs/props.c
>> +++ b/fs/btrfs/props.c
>> @@ -380,6 +380,7 @@ static int prop_compression_apply(struct inode *inode,
>>const char *value,
>>size_t len)
>>  {
>> +struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>>  int type;
>>  
>>  if (len == 0) {
>> @@ -390,14 +391,17 @@ static int prop_compression_apply(struct inode *inode,
>>  return 0;
>>  }
>>  
>> -if (!strncmp("lzo", value, 3))
>> +if (!strncmp("lzo", value, 3)) {
>>  type = BTRFS_COMPRESS_LZO;
>> -else if (!strncmp("zlib", value, 4))
>> +btrfs_set_fs_incompat(fs_info, COMPRESS_LZO);
>> +} else if (!strncmp("zlib", value, 4)) {
>>  type = BTRFS_COMPRESS_ZLIB;
>> -else if (!strncmp("zstd", value, len))
>> +} else if (!strncmp("zstd", value, len)) {
>>  type = BTRFS_COMPRESS_ZSTD;
>> -else
>> +btrfs_set_fs_incompat(fs_info, COMPRESS_ZSTD);
>> +} else {
>>  return -EINVAL;
>> +}
>>  
>>  BTRFS_I(inode)->flags &= ~BTRFS_INODE_NOCOMPRESS;
>>  BTRFS_I(inode)->flags |= BTRFS_INODE_COMPRESS;
>>
> 
> 






Re: [PATCH] btrfs: property: Set incompat flag of lzo/zstd compression

2018-05-15 Thread Anand Jain



On 05/15/2018 03:51 PM, Misono Tomohiro wrote:

Incompat flag of lzo/zstd compression should be set at:
  1. mount time (-o compress/compress-force)
  2. when defrag is done
  3. when property is set

Currently 3. is missing and this commit adds this.

Signed-off-by: Tomohiro Misono <misono.tomoh...@jp.fujitsu.com>


Reviewed-by: Anand Jain <anand.j...@oracle.com>


Re: [PATCH] btrfs: property: Set incompat flag of lzo/zstd compression

2018-05-15 Thread Su Yue


On 05/15/2018 03:51 PM, Misono Tomohiro wrote:
> Incompat flag of lzo/zstd compression should be set at:
>  1. mount time (-o compress/compress-force)
>  2. when defrag is done
>  3. when property is set
> 
> Currently 3. is missing and this commit adds this.
> 

If I don't misunderstand, the compression property of an inode only
applies to *the* inode, not the whole filesystem.
So the original logic should be okay.

Thanks,
Su

> Signed-off-by: Tomohiro Misono <misono.tomoh...@jp.fujitsu.com>
> ---
>  fs/btrfs/props.c | 12 
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/props.c b/fs/btrfs/props.c
> index 53a8c95828e3..dc6140013ae8 100644
> --- a/fs/btrfs/props.c
> +++ b/fs/btrfs/props.c
> @@ -380,6 +380,7 @@ static int prop_compression_apply(struct inode *inode,
> const char *value,
> size_t len)
>  {
> + struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
>   int type;
>  
>   if (len == 0) {
> @@ -390,14 +391,17 @@ static int prop_compression_apply(struct inode *inode,
>   return 0;
>   }
>  
> - if (!strncmp("lzo", value, 3))
> + if (!strncmp("lzo", value, 3)) {
>   type = BTRFS_COMPRESS_LZO;
> - else if (!strncmp("zlib", value, 4))
> + btrfs_set_fs_incompat(fs_info, COMPRESS_LZO);
> + } else if (!strncmp("zlib", value, 4)) {
>   type = BTRFS_COMPRESS_ZLIB;
> - else if (!strncmp("zstd", value, len))
> + } else if (!strncmp("zstd", value, len)) {
>   type = BTRFS_COMPRESS_ZSTD;
> - else
> + btrfs_set_fs_incompat(fs_info, COMPRESS_ZSTD);
> + } else {
>   return -EINVAL;
> + }
>  
>   BTRFS_I(inode)->flags &= ~BTRFS_INODE_NOCOMPRESS;
>   BTRFS_I(inode)->flags |= BTRFS_INODE_COMPRESS;
> 






[PATCH] btrfs: property: Set incompat flag of lzo/zstd compression

2018-05-15 Thread Misono Tomohiro
Incompat flag of lzo/zstd compression should be set at:
 1. mount time (-o compress/compress-force)
 2. when defrag is done
 3. when property is set

Currently 3. is missing and this commit adds this.

Signed-off-by: Tomohiro Misono <misono.tomoh...@jp.fujitsu.com>
---
 fs/btrfs/props.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/fs/btrfs/props.c b/fs/btrfs/props.c
index 53a8c95828e3..dc6140013ae8 100644
--- a/fs/btrfs/props.c
+++ b/fs/btrfs/props.c
@@ -380,6 +380,7 @@ static int prop_compression_apply(struct inode *inode,
  const char *value,
  size_t len)
 {
+   struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
int type;
 
if (len == 0) {
@@ -390,14 +391,17 @@ static int prop_compression_apply(struct inode *inode,
return 0;
}
 
-   if (!strncmp("lzo", value, 3))
+   if (!strncmp("lzo", value, 3)) {
type = BTRFS_COMPRESS_LZO;
-   else if (!strncmp("zlib", value, 4))
+   btrfs_set_fs_incompat(fs_info, COMPRESS_LZO);
+   } else if (!strncmp("zlib", value, 4)) {
type = BTRFS_COMPRESS_ZLIB;
-   else if (!strncmp("zstd", value, len))
+   } else if (!strncmp("zstd", value, len)) {
type = BTRFS_COMPRESS_ZSTD;
-   else
+   btrfs_set_fs_incompat(fs_info, COMPRESS_ZSTD);
+   } else {
return -EINVAL;
+   }
 
BTRFS_I(inode)->flags &= ~BTRFS_INODE_NOCOMPRESS;
BTRFS_I(inode)->flags |= BTRFS_INODE_COMPRESS;
-- 
2.14.3




Re: zstd compression

2017-11-16 Thread Timofey Titovets
2017-11-16 19:32 GMT+03:00 Austin S. Hemmelgarn :
> On 2017-11-16 08:43, Duncan wrote:
>>
>> Austin S. Hemmelgarn posted on Thu, 16 Nov 2017 07:30:47 -0500 as
>> excerpted:
>>
>>> On 2017-11-15 16:31, Duncan wrote:

 Austin S. Hemmelgarn posted on Wed, 15 Nov 2017 07:57:06 -0500 as
 excerpted:

> The 'compress' and 'compress-force' mount options only impact newly
> written data.  The compression used is stored with the metadata for
> the extents themselves, so any existing data on the volume will be
> read just fine with whatever compression method it was written with,
> while new data will be written with the specified compression method.
>
> If you want to convert existing files, you can use the '-c' option to
> the defrag command to do so.


 ... Being aware of course that using defrag to recompress files like
 that will break 100% of the existing reflinks, effectively (near)
 doubling data usage if the files are snapshotted, since the snapshot
 will now share 0% of its extents with the newly compressed files.
>>>
>>> Good point, I forgot to mention that.


 (The actual effect shouldn't be quite that bad, as some files are
 likely to be uncompressed due to not compressing well, and I'm not sure
 if defrag -c rewrites them or not.  Further, if there's multiple
 snapshots data usage should only double with respect to the latest one,
 the data delta between it and previous snapshots won't be doubled as
 well.)
>>>
>>> I'm pretty sure defrag is equivalent to 'compress-force', not
>>> 'compress', but I may be wrong.
>>
>>
>> But... compress-force doesn't actually force compression _all_ the time.
>> Rather, it forces btrfs to continue checking whether compression is worth
>> it for each "block"[1] of the file, instead of giving up if the first
>> quick try at the beginning says that block won't compress.
>>
>> So what I'm saying is that if the snapshotted data is already compressed,
>> think (pre-)compressed tarballs or image files such as jpeg that are
>> unlikely to /easily/ compress further and might well actually be _bigger_
>> once the compression algorithm is run over them, defrag -c will likely
>> fail to compress them further even if it's the equivalent of compress-
>> force, and thus /should/ leave them as-is, not breaking the reflinks of
>> the snapshots and thus not doubling the data usage for that file, or more
>> exactly, that extent of that file.
>>
>> Tho come to think of it, is defrag -c that smart, to actually leave the
>> data as-is if it doesn't compress further, or does it still rewrite it
>> even if it doesn't compress, thus breaking the reflink and doubling the
>> usage regardless?
>
> I'm not certain how compression factors in, but if you aren't compressing
> the file, it will only get rewritten if it's fragmented (which is why
> defragmenting the system root directory is usually insanely fast on most
> systems, stuff there is almost never fragmented).
>>
>>
>> ---
>> [1] Block:  I'm not positive it's the usual 4K block in this case.  I
>> think I read that it's 16K, but I might be confused on that.  But
>> regardless the size, the point is, with compress-force btrfs won't give
>> up like simple compress will if the first "block" doesn't compress, it'll
>> keep trying.
>>
>> Of course the new compression heuristic changes this a bit too, but the
>> same general idea holds, compress-force continues to try for the entire
>> file, compress will give up much faster.
>
> I'm not actually sure, I would think it checks 128k blocks of data (the
> effective block size for compression), but if it doesn't it should be
> checking at the filesystem block size (which means 16k on most recently
> created filesystems).
>

Defragmentation of data on btrfs simply rewrites the data if it doesn't
meet some criteria.
All that -c does is say which compression method to apply to newly
written data, no more, no less.
On the write side, the FS sees long/short data ranges for writing (see
compress_file_range()); if compression is needed, it splits the data
into 128KiB chunks and passes them to the compression logic (roughly as
in the sketch below).
The compression logic gives up by itself in 2 cases:
1. Compressing the first 2 (or 3?) page-sized blocks of the 128KiB
chunk makes the data bigger -> give up -> write the data as is
2. After compression is done, if compression didn't free at least one
sector size -> write the data as is

i.e.
If you write 16 KiB at a time, btrfs will compress each separate write as 16 KiB.
If you write 1 MiB at a time, btrfs will split it into 128 KiB chunks.
If you write 1025 KiB, btrfs will split it by 128 KiB and the last 1 KiB
will be written as is.
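
A rough sketch of that write-side flow (the helpers below are
hypothetical stand-ins, not real btrfs functions; the actual logic
lives in compress_file_range() and the per-algorithm compress_pages()
implementations):

/* Hypothetical helpers, for illustration only: */
static bool try_compress_chunk(u64 start, u64 len);
static void write_compressed(u64 start, u64 len);
static void write_uncompressed(u64 start, u64 len);

static void compress_range_sketch(u64 start, u64 len)
{
	while (len >= PAGE_SIZE) {
		/* work on at most 128 KiB at a time */
		u64 chunk = min_t(u64, len, SZ_128K);

		/*
		 * The stand-in compressor gives up if the first couple
		 * of page-sized blocks grow when compressed, or if the
		 * result doesn't save at least one sector; in either
		 * case the chunk is written uncompressed.
		 */
		if (try_compress_chunk(start, chunk))
			write_compressed(start, chunk);
		else
			write_uncompressed(start, chunk);

		start += chunk;
		len -= chunk;
	}
	if (len)	/* a sub-page tail is written as is */
		write_uncompressed(start, len);
}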

JFYI:
All the heuristic logic does (i.e. compress, not compress-force) is:
on every write, the kernel checks whether compression is 

Re: zstd compression

2017-11-16 Thread Austin S. Hemmelgarn

On 2017-11-16 08:43, Duncan wrote:

> Austin S. Hemmelgarn posted on Thu, 16 Nov 2017 07:30:47 -0500 as
> excerpted:
> 
>> On 2017-11-15 16:31, Duncan wrote:
>>> Austin S. Hemmelgarn posted on Wed, 15 Nov 2017 07:57:06 -0500 as
>>> excerpted:
>>> 
>>>> The 'compress' and 'compress-force' mount options only impact newly
>>>> written data.  The compression used is stored with the metadata for
>>>> the extents themselves, so any existing data on the volume will be
>>>> read just fine with whatever compression method it was written with,
>>>> while new data will be written with the specified compression method.
>>>>
>>>> If you want to convert existing files, you can use the '-c' option to
>>>> the defrag command to do so.
>>> 
>>> ... Being aware of course that using defrag to recompress files like
>>> that will break 100% of the existing reflinks, effectively (near)
>>> doubling data usage if the files are snapshotted, since the snapshot
>>> will now share 0% of its extents with the newly compressed files.
>> Good point, I forgot to mention that.
>>> 
>>> (The actual effect shouldn't be quite that bad, as some files are
>>> likely to be uncompressed due to not compressing well, and I'm not sure
>>> if defrag -c rewrites them or not.  Further, if there's multiple
>>> snapshots data usage should only double with respect to the latest one,
>>> the data delta between it and previous snapshots won't be doubled as
>>> well.)
>> I'm pretty sure defrag is equivalent to 'compress-force', not
>> 'compress', but I may be wrong.
> 
> But... compress-force doesn't actually force compression _all_ the time.
> Rather, it forces btrfs to continue checking whether compression is worth
> it for each "block"[1] of the file, instead of giving up if the first
> quick try at the beginning says that block won't compress.
> 
> So what I'm saying is that if the snapshotted data is already compressed,
> think (pre-)compressed tarballs or image files such as jpeg that are
> unlikely to /easily/ compress further and might well actually be _bigger_
> once the compression algorithm is run over them, defrag -c will likely
> fail to compress them further even if it's the equivalent of compress-
> force, and thus /should/ leave them as-is, not breaking the reflinks of
> the snapshots and thus not doubling the data usage for that file, or more
> exactly, that extent of that file.
> 
> Tho come to think of it, is defrag -c that smart, to actually leave the
> data as-is if it doesn't compress further, or does it still rewrite it
> even if it doesn't compress, thus breaking the reflink and doubling the
> usage regardless?
I'm not certain how compression factors in, but if you aren't
compressing the file, it will only get rewritten if it's fragmented
(which is why defragmenting the system root directory is usually
insanely fast on most systems, stuff there is almost never fragmented).
> 
> ---
> [1] Block:  I'm not positive it's the usual 4K block in this case.  I
> think I read that it's 16K, but I might be confused on that.  But
> regardless the size, the point is, with compress-force btrfs won't give
> up like simple compress will if the first "block" doesn't compress, it'll
> keep trying.
> 
> Of course the new compression heuristic changes this a bit too, but the
> same general idea holds, compress-force continues to try for the entire
> file, compress will give up much faster.
I'm not actually sure, I would think it checks 128k blocks of data (the
effective block size for compression), but if it doesn't it should be
checking at the filesystem block size (which means 16k on most recently
created filesystems).



Re: zstd compression

2017-11-16 Thread Duncan
Austin S. Hemmelgarn posted on Thu, 16 Nov 2017 07:30:47 -0500 as
excerpted:

> On 2017-11-15 16:31, Duncan wrote:
>> Austin S. Hemmelgarn posted on Wed, 15 Nov 2017 07:57:06 -0500 as
>> excerpted:
>> 
>>> The 'compress' and 'compress-force' mount options only impact newly
>>> written data.  The compression used is stored with the metadata for
>>> the extents themselves, so any existing data on the volume will be
>>> read just fine with whatever compression method it was written with,
>>> while new data will be written with the specified compression method.
>>>
>>> If you want to convert existing files, you can use the '-c' option to
>>> the defrag command to do so.
>> 
>> ... Being aware of course that using defrag to recompress files like
>> that will break 100% of the existing reflinks, effectively (near)
>> doubling data usage if the files are snapshotted, since the snapshot
>> will now share 0% of its extents with the newly compressed files.
> Good point, I forgot to mention that.
>> 
>> (The actual effect shouldn't be quite that bad, as some files are
>> likely to be uncompressed due to not compressing well, and I'm not sure
>> if defrag -c rewrites them or not.  Further, if there's multiple
>> snapshots data usage should only double with respect to the latest one,
>> the data delta between it and previous snapshots won't be doubled as
>> well.)
> I'm pretty sure defrag is equivalent to 'compress-force', not
> 'compress', but I may be wrong.

But... compress-force doesn't actually force compression _all_ the time.  
Rather, it forces btrfs to continue checking whether compression is worth 
it for each "block"[1] of the file, instead of giving up if the first 
quick try at the beginning says that block won't compress.  

So what I'm saying is that if the snapshotted data is already compressed, 
think (pre-)compressed tarballs or image files such as jpeg that are 
unlikely to /easily/ compress further and might well actually be _bigger_ 
once the compression algorithm is run over them, defrag -c will likely 
fail to compress them further even if it's the equivalent of compress-
force, and thus /should/ leave them as-is, not breaking the reflinks of 
the snapshots and thus not doubling the data usage for that file, or more 
exactly, that extent of that file.

Tho come to think of it, is defrag -c that smart, to actually leave the 
data as-is if it doesn't compress further, or does it still rewrite it 
even if it doesn't compress, thus breaking the reflink and doubling the 
usage regardless?

---
[1] Block:  I'm not positive it's the usual 4K block in this case.  I 
think I read that it's 16K, but I might be confused on that.  But 
regardless the size, the point is, with compress-force btrfs won't give 
up like simple compress will if the first "block" doesn't compress, it'll 
keep trying.

Of course the new compression heuristic changes this a bit too, but the 
same general idea holds, compress-force continues to try for the entire 
file, compress will give up much faster. 
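
As I understand it, the difference can be pictured like this (an
illustrative sketch, not the kernel's code; the nocompress field only
approximates the BTRFS_INODE_NOCOMPRESS flag, and the function names
are invented):

#include <stdbool.h>

struct inode_state {
	bool nocompress;   /* roughly the BTRFS_INODE_NOCOMPRESS flag */
};

/* compress-force: always try again; plain compress: stop for good
 * once an earlier attempt on this inode has failed */
bool should_try_compress(const struct inode_state *inode, bool force)
{
	return force || !inode->nocompress;
}

/* on a failed attempt, plain compress remembers the failure so later
 * writes to the same inode skip compression entirely */
void note_compression_failed(struct inode_state *inode, bool force)
{
	if (!force)
		inode->nocompress = true;
}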

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: zstd compression

2017-11-16 Thread Imran Geriskovan
On 11/16/17, Austin S. Hemmelgarn  wrote:
> I'm pretty sure defrag is equivalent to 'compress-force', not
> 'compress', but I may be wrong.

Are there any devs to confirm this?


Re: zstd compression

2017-11-16 Thread Austin S. Hemmelgarn

On 2017-11-15 16:31, Duncan wrote:

> Austin S. Hemmelgarn posted on Wed, 15 Nov 2017 07:57:06 -0500 as
> excerpted:
> 
>> The 'compress' and 'compress-force' mount options only impact newly
>> written data.  The compression used is stored with the metadata for the
>> extents themselves, so any existing data on the volume will be read just
>> fine with whatever compression method it was written with, while new
>> data will be written with the specified compression method.
>>
>> If you want to convert existing files, you can use the '-c' option to
>> the defrag command to do so.
> 
> ... Being aware of course that using defrag to recompress files like that
> will break 100% of the existing reflinks, effectively (near) doubling
> data usage if the files are snapshotted, since the snapshot will now
> share 0% of its extents with the newly compressed files.
Good point, I forgot to mention that.
> 
> (The actual effect shouldn't be quite that bad, as some files are likely
> to be uncompressed due to not compressing well, and I'm not sure if
> defrag -c rewrites them or not.  Further, if there's multiple snapshots
> data usage should only double with respect to the latest one, the data
> delta between it and previous snapshots won't be doubled as well.)
I'm pretty sure defrag is equivalent to 'compress-force', not
'compress', but I may be wrong.
> 
> While this makes sense if you think about it, it may not occur to some
> people until they've actually tried it, and see their data usage go way
> up instead of going down as they intuitively expected.  There have been
> posts to the list...
> 
> Of course if the data isn't snapshotted this doesn't apply and defrag -c
> to zstd should be fine. =:^)





Re: zstd compression

2017-11-15 Thread Duncan
Austin S. Hemmelgarn posted on Wed, 15 Nov 2017 07:57:06 -0500 as
excerpted:

> The 'compress' and 'compress-force' mount options only impact newly
> written data.  The compression used is stored with the metadata for the
> extents themselves, so any existing data on the volume will be read just
> fine with whatever compression method it was written with, while new
> data will be written with the specified compression method.
> 
> If you want to convert existing files, you can use the '-c' option to
> the defrag command to do so.

... Being aware of course that using defrag to recompress files like that 
will break 100% of the existing reflinks, effectively (near) doubling 
data usage if the files are snapshotted, since the snapshot will now 
share 0% of its extents with the newly compressed files.

(The actual effect shouldn't be quite that bad, as some files are likely 
to be uncompressed due to not compressing well, and I'm not sure if 
defrag -c rewrites them or not.  Further, if there's multiple snapshots 
data usage should only double with respect to the latest one, the data 
delta between it and previous snapshots won't be doubled as well.)

While this makes sense if you think about it, it may not occur to some 
people until they've actually tried it, and see their data usage go way 
up instead of going down as they intuitively expected.  There have been 
posts to the list...

Of course if the data isn't snapshotted this doesn't apply and defrag -c 
to zstd should be fine. =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



Re: zstd compression

2017-11-15 Thread Austin S. Hemmelgarn

On 2017-11-15 05:35, Imran Geriskovan wrote:

> On 11/15/17, Lukas Pirl  wrote:
>> you might be interested in the thread "Read before you deploy
>> btrfs + zstd"¹.
> 
> Thanks. I've read it. Bootloader is not an issue since /boot is on
> another uncompressed fs.
> 
> Let me make my question more generic:
> 
> Can there be any issues for switching mount time
> compressions options from one to another, in any order?
> (i.e none -> lzo -> zlib -> zstd -> none -> ...)
> 
> zstd is only a newcomer so my question applies to all
> combinations..
The 'compress' and 'compress-force' mount options only impact newly 
written data.  The compression used is stored with the metadata for the 
extents themselves, so any existing data on the volume will be read just 
fine with whatever compression method it was written with, while new 
data will be written with the specified compression method.


If you want to convert existing files, you can use the '-c' option to 
the defrag command to do so.
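
Conceptually it looks something like this (an illustrative sketch, not
btrfs's actual on-disk structures; the enum values only mirror the
kernel's compression types and the helper names are invented):

/* Each extent's metadata records which algorithm wrote it, so reads
 * dispatch on the per-extent field and never consult the current
 * mount option. */
enum compress_type {
	COMPRESS_NONE,   /* mirrors BTRFS_COMPRESS_NONE and friends */
	COMPRESS_ZLIB,
	COMPRESS_LZO,
	COMPRESS_ZSTD,
};

struct extent_info {
	enum compress_type compression;  /* stored per extent */
	/* offsets, lengths, checksums, ... */
};

/* assumed illustrative helpers, one per algorithm */
int read_raw(const struct extent_info *ei);
int zlib_decompress_extent(const struct extent_info *ei);
int lzo_decompress_extent(const struct extent_info *ei);
int zstd_decompress_extent(const struct extent_info *ei);

int read_extent(const struct extent_info *ei)
{
	switch (ei->compression) {
	case COMPRESS_NONE: return read_raw(ei);
	case COMPRESS_ZLIB: return zlib_decompress_extent(ei);
	case COMPRESS_LZO:  return lzo_decompress_extent(ei);
	case COMPRESS_ZSTD: return zstd_decompress_extent(ei);
	}
	return -1;
}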


Aside from this, there is one other thing to keep in mind about zstd 
which I mentioned later in the above mentioned thread.  Most system 
recovery tools do not yet have a new enough version of the kernel and/or
btrfs-progs to be able to access BTRFS volumes with zstd compressed
data or metadata, so you may need to roll your own recovery solution for 
the time being if you want to use zstd.



Re: zstd compression

2017-11-15 Thread Imran Geriskovan
On 11/15/17, Lukas Pirl  wrote:
> you might be interested in the thread "Read before you deploy
> btrfs + zstd"¹.

Thanks. I've read it. Bootloader is not an issue since /boot is on
another uncompressed fs.

Let me make my question more generic:

Can there be any issues for switching mount time
compressions options from one to another, in any order?
(i.e none -> lzo -> zlib -> zstd -> none -> ...)

zstd is only a newcomer so my question applies to all
combinations..


Re: zstd compression

2017-11-15 Thread Lukas Pirl
Hi Imran,

On 11/15/2017 09:51 AM, Imran Geriskovan wrote as excerpted:
> Any further advices?

you might be interested in the thread "Read before you deploy btrfs +
zstd"¹.

Cheers,

Lukas

¹ https://www.mail-archive.com/linux-btrfs@vger.kernel.org/msg69871.html


zstd compression

2017-11-15 Thread Imran Geriskovan
Kernel 4.14 now includes btrfs zstd compression support.

My question:
I currently have a fs mounted and used with "compress=lzo"
option. What happens if I change it to "compress=zstd"?

My guess is that existing files will be read and uncompressed via lzo.
And new files will be written with zstd compression. And
everything will run smoothly.

Is this optimistic guess valid? What are possible pitfalls,
if there are any? Any further advices?

Regards,
Imran