Re: [GIT PULL][PATCH v9 0/3] Update to zstd-1.4.10

2021-04-14 Thread Nick Terrell
On Wed, Apr 14, 2021 at 12:04 PM Eric Biggers  wrote:
>
> On Wed, Apr 14, 2021 at 11:53:51AM -0700, Nick Terrell wrote:
> > On Wed, Apr 14, 2021 at 11:35 AM Eric Biggers  wrote:
> > >
> > > On Wed, Apr 14, 2021 at 11:01:29AM -0700, Nick Terrell wrote:
> > > > Hi all,
> > > >
> > > > I would really like to make some progress on this and get it merged.
> > > > This patchset offers:
> > > > * 15-30% better decompression speed
> > > > * 3 years of zstd bug fixes and code improvements
> > > > * Allows us to import zstd directly from upstream so we don't fall 3
> > > > years out of date again
> > > >
> > > > Thanks,
> > > > Nick
> > > >
> > >
> > > I think it would help get it merged if someone actually volunteered to
> > > maintain it.  As-is there is no entry in MAINTAINERS for this code.
> >
> > I was discussing with Chris Mason about volunteering to maintain the
> > code myself. We wanted to wait until this series got merged before
> > going that route, because there were already a lot of comments about
> > it, and I didn't want to appear to be trying to bypass any review or
> > criticisms. But please let me know what you think.
> >
>
> I expect that most people would like to see a commitment to maintain this code
> before merging.  The usual way to do that is to add a MAINTAINERS entry.
>
> Otherwise it is 27000 lines of code dumped on other people to maintain.

I will add a 4th patch in the series to update the MAINTAINERS.

> - Eric


Re: [GIT PULL][PATCH v9 0/3] Update to zstd-1.4.10

2021-04-14 Thread Nick Terrell
On Wed, Apr 14, 2021 at 11:35 AM Eric Biggers  wrote:
>
> On Wed, Apr 14, 2021 at 11:01:29AM -0700, Nick Terrell wrote:
> > Hi all,
> >
> > I would really like to make some progress on this and get it merged.
> > This patchset offers:
> > * 15-30% better decompression speed
> > * 3 years of zstd bug fixes and code improvements
> > * Allows us to import zstd directly from upstream so we don't fall 3
> > years out of date again
> >
> > Thanks,
> > Nick
> >
>
> I think it would help get it merged if someone actually volunteered to
> maintain it.  As-is there is no entry in MAINTAINERS for this code.

I was discussing with Chris Mason about volunteering to maintain the
code myself. We wanted to wait until this series got merged before
going that route, because there were already a lot of comments about
it, and I didn't want to appear to be trying to bypass any review or
criticisms. But please let me know what you think.

Best,
Nick

> - Eric


Re: [GIT PULL][PATCH v9 0/3] Update to zstd-1.4.10

2021-04-14 Thread Nick Terrell
Hi all,

I would really like to make some progress on this and get it merged.
This patchset offers:
* 15-30% better decompression speed
* 3 years of zstd bug fixes and code improvements
* Allows us to import zstd directly from upstream so we don't fall 3
years out of date again

Thanks,
Nick

On Fri, Apr 9, 2021 at 2:39 PM Nick Terrell  wrote:
>
> What can I do to help get this merged?
>
> Christoph, is this new patch series with the kernel wrapper API satisfactory?
>
> Best,
> Nick
>
> On Tue, Mar 30, 2021 at 3:45 PM Nick Terrell  wrote:
> >
> > From: Nick Terrell 
> >
> > Please pull from
> >
> >   g...@github.com:terrelln/linux.git tags/v9-zstd-1.4.10
> >
> > to get these changes. Alternatively the patchset is included.
> >
> > This patchset upgrades the zstd library to the latest upstream release. The
> > current zstd version in the kernel is a modified version of upstream
> > zstd-1.3.1. At the time it was integrated, zstd wasn't ready to be used in
> > the kernel as-is. But, it is now possible to use upstream zstd directly in
> > the kernel.
> >
> > I have not yet released zstd-1.4.10 upstream. I want the zstd version in the
> > kernel to match up with a known upstream release, so we know exactly what
> > code is running. Whenever this patchset is ready for merge, I will cut a
> > release at the upstream commit that gets merged. This should not be necessary
> > for future releases.
> >
> > The kernel zstd library is automatically generated from upstream zstd. A
> > script makes the necessary changes and imports it into the kernel. The
> > changes are:
> >
> > 1. Replace all libc dependencies with kernel replacements and rewrite
> >    includes.
> > 2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
> > 3. Use the kernel xxhash instead of bundling it.
> >
> > This automation gets tested every commit by upstream's continuous
> > integration. When we cut a new zstd release, we will submit a patch to the
> > kernel to update the zstd version in the kernel.
> >
> > I've updated zstd to upstream with one big patch because every commit must
> > build, so that precludes partial updates. Since the commit is 100% generated,
> > I hope the review burden is lightened. I considered replaying upstream
> > commits, but that is not possible because there have been ~3500 upstream
> > commits since the last zstd import, and the commits don't all build
> > individually. The bulk update preserves bisectability because bugs can be
> > bisected to the zstd version update. At that point the update can be
> > reverted, and we can work with upstream to find and fix the bug. After this
> > big switch in how the kernel consumes zstd, future patches will be smaller,
> > because they will only have one upstream release worth of changes each.
> >
> > This patchset adds a new kernel-style wrapper around zstd. This wrapper API
> > is functionally equivalent to the subset of the current zstd API that is
> > currently used. The wrapper API changes to be kernel style so that the
> > symbols don't collide with zstd's symbols. The update to zstd-1.4.6 maintains
> > the same API and preserves the semantics, so that none of the callers need to
> > be updated.
> >
> > This patchset comes in 2 parts:
> > 1. The first 2 patches prepare for the zstd upgrade. The first patch adds
> >    the new kernel style API so zstd can be upgraded without modifying any
> >    callers. The second patch adds an indirection for the
> >    lib/decompress_unzstd.c including of all decompression source files.
> > 2. Import zstd-1.4.10. This patch is completely generated from upstream
> >    using automated tooling.
> >
> > I tested every caller of zstd on x86_64. I tested both after the 1.4.10
> > upgrade using the compatibility wrapper, and after the final patch in this
> > series.
> >
> > I tested kernel and initramfs decompression in i386 and arm.
> >
> > I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
> > I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
> > I found:
> > * BtrFS zstd compression at levels 1 and 3 is 5% faster
> > * BtrFS zstd decompression+read is 15% faster
> > * SquashFS zstd decompression+read is 15% faster
> > * F2FS zstd compression+write at level 3 is 8% faster
> > * F2FS zstd decompression+read is 20

Re: [PATCH -next] lib: zstd: Make symbol 'HUF_compressWeights_wksp' static

2021-04-09 Thread Nick Terrell



> On Apr 8, 2021, at 8:09 PM, Miguel Ojeda  
> wrote:
> 
> On Fri, Apr 9, 2021 at 2:20 AM Nick Desaulniers  
> wrote:
>> 
>> Quite a few other functions are declared in a header, but I don't see
>> any existing callers in tree.  I wonder if the maintainer could
>> consider cleaning these up so that we don't retain them in binaries
>> without dead code elimination enabled, or if there's a need to keep
>> this code in line with an external upstream codebase?
> 
> Yeah, the equivalent cleanup was done upstream by Nick in 2018 [1],
> but there has been no major update to lib/zstd since 2017.
> 
> Thus a cleanup would actually make it closer to upstream, which is the
> best case scenario :)
> 
>Reviewed-by: Miguel Ojeda 
> 
> [1] 
> https://github.com/facebook/zstd/commit/f2d6db45cd28457fa08467416e8535985f062859

This looks good to me as well. I have a patchset up to use upstream zstd
directly in the kernel [0]. That will allow us to keep zstd up to date. And
after that lands, I hope to set up a zstd linux tree to make merging patches
into lib/zstd easier, since over the years quite a few have been ignored.

[0] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2532407.html

Best,
Nick Terrell

> Cheers,
> Miguel



Re: [GIT PULL][PATCH v9 0/3] Update to zstd-1.4.10

2021-04-09 Thread Nick Terrell
What can I do to help get this merged?

Christoph, is this new patch series with the kernel wrapper API satisfactory?

Best,
Nick

On Tue, Mar 30, 2021 at 3:45 PM Nick Terrell  wrote:
>
> From: Nick Terrell 
>
> Please pull from
>
>   g...@github.com:terrelln/linux.git tags/v9-zstd-1.4.10
>
> to get these changes. Alternatively the patchset is included.
>
> This patchset upgrades the zstd library to the latest upstream release. The
> current zstd version in the kernel is a modified version of upstream
> zstd-1.3.1. At the time it was integrated, zstd wasn't ready to be used in the
> kernel as-is. But, it is now possible to use upstream zstd directly in the
> kernel.
>
> I have not yet released zstd-1.4.10 upstream. I want the zstd version in the
> kernel to match up with a known upstream release, so we know exactly what code
> is running. Whenever this patchset is ready for merge, I will cut a release at
> the upstream commit that gets merged. This should not be necessary for future
> releases.
>
> The kernel zstd library is automatically generated from upstream zstd. A
> script makes the necessary changes and imports it into the kernel. The
> changes are:
>
> 1. Replace all libc dependencies with kernel replacements and rewrite
>    includes.
> 2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
> 3. Use the kernel xxhash instead of bundling it.
>
> This automation gets tested every commit by upstream's continuous integration.
> When we cut a new zstd release, we will submit a patch to the kernel to update
> the zstd version in the kernel.
>
> I've updated zstd to upstream with one big patch because every commit must
> build, so that precludes partial updates. Since the commit is 100% generated,
> I hope the review burden is lightened. I considered replaying upstream
> commits, but that is not possible because there have been ~3500 upstream
> commits since the last zstd import, and the commits don't all build
> individually. The bulk update preserves bisectability because bugs can be
> bisected to the zstd version update. At that point the update can be reverted,
> and we can work with upstream to find and fix the bug. After this big switch
> in how the kernel consumes zstd, future patches will be smaller, because they
> will only have one upstream release worth of changes each.
>
> This patchset adds a new kernel-style wrapper around zstd. This wrapper API is
> functionally equivalent to the subset of the current zstd API that is
> currently used. The wrapper API changes to be kernel style so that the symbols
> don't collide with zstd's symbols. The update to zstd-1.4.6 maintains the same
> API and preserves the semantics, so that none of the callers need to be
> updated.
>
> This patchset comes in 2 parts:
> 1. The first 2 patches prepare for the zstd upgrade. The first patch adds the
>    new kernel style API so zstd can be upgraded without modifying any callers.
>    The second patch adds an indirection for the lib/decompress_unzstd.c
>    including of all decompression source files.
> 2. Import zstd-1.4.10. This patch is completely generated from upstream using
>    automated tooling.
>
> I tested every caller of zstd on x86_64. I tested both after the 1.4.10
> upgrade using the compatibility wrapper, and after the final patch in this
> series.
>
> I tested kernel and initramfs decompression in i386 and arm.
>
> I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
> I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
> I found:
> * BtrFS zstd compression at levels 1 and 3 is 5% faster
> * BtrFS zstd decompression+read is 15% faster
> * SquashFS zstd decompression+read is 15% faster
> * F2FS zstd compression+write at level 3 is 8% faster
> * F2FS zstd decompression+read is 20% faster
> * ZRAM decompression+read is 30% faster
> * Kernel zstd decompression is 35% faster
> * Initramfs zstd decompression+build is 5% faster
>
> The latest zstd also offers bug fixes. For example the problem with large
> kernel decompression has been fixed upstream for over 2 years
> https://lkml.org/lkml/2020/9/29/27.
>
> Please let me know if there is anything that I can do to ease the way for
> these patches. I think it is important because it gets large performance
> improvements, contains bug fixes, and is switching to a more maintainable
> model of consuming upstream zstd directly, making it easy to keep up to date.
>
> Best,
> Nick Terrell
>
> v1 -> v2:
> * Successfully tested F2FS with help from Chao Yu to fix my test.
> * (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 means 
>

Re: [PATCH] init: add support for zstd compressed modules

2021-04-07 Thread Nick Terrell


> On Apr 7, 2021, at 6:53 AM, Masahiro Yamada  wrote:
> 
> On Thu, Apr 1, 2021 at 4:21 AM Nick Terrell  wrote:
>> 
>> 
>> 
>>> On Mar 31, 2021, at 10:48 AM, Oleksandr Natalenko 
>>>  wrote:
>>> 
>>> Hello.
>>> 
>>> On Wed, Mar 31, 2021 at 05:39:25PM +, Nick Terrell wrote:
>>>> 
>>>> 
>>>>> On Mar 30, 2021, at 4:50 AM, Oleksandr Natalenko 
>>>>>  wrote:
>>>>> 
>>>>> On Tue, Mar 30, 2021 at 01:32:35PM +0200, Piotr Gorski wrote:
>>>>>> kmod 28 supports modules compressed in zstd format so let's add this 
>>>>>> possibility to kernel.
>>>>>> 
>>>>>> Signed-off-by: Piotr Gorski 
>>>>>> ---
>>>>>> Makefile | 7 +--
>>>>>> init/Kconfig | 9 ++---
>>>>>> 2 files changed, 11 insertions(+), 5 deletions(-)
>>>>>> 
>>>>>> diff --git a/Makefile b/Makefile
>>>>>> index 5160ff8903c1..82f4f4cc2955 100644
>>>>>> --- a/Makefile
>>>>>> +++ b/Makefile
>>>>>> @@ -1156,8 +1156,8 @@ endif # INSTALL_MOD_STRIP
>>>>>> export mod_strip_cmd
>>>>>> 
>>>>>> # CONFIG_MODULE_COMPRESS, if defined, will cause module to be compressed
>>>>>> -# after they are installed in agreement with CONFIG_MODULE_COMPRESS_GZIP
>>>>>> -# or CONFIG_MODULE_COMPRESS_XZ.
>>>>>> +# after they are installed in agreement with CONFIG_MODULE_COMPRESS_GZIP,
>>>>>> +# CONFIG_MODULE_COMPRESS_XZ, or CONFIG_MODULE_COMPRESS_ZSTD.
>>>>>> 
>>>>>> mod_compress_cmd = true
>>>>>> ifdef CONFIG_MODULE_COMPRESS
>>>>>> @@ -1167,6 +1167,9 @@ ifdef CONFIG_MODULE_COMPRESS
>>>>>> ifdef CONFIG_MODULE_COMPRESS_XZ
>>>>>>   mod_compress_cmd = $(XZ) --lzma2=dict=2MiB -f
>>>>>> endif # CONFIG_MODULE_COMPRESS_XZ
>>>>>> +  ifdef CONFIG_MODULE_COMPRESS_ZSTD
>>>>>> +mod_compress_cmd = $(ZSTD) -T0 --rm -f -q
>>>> 
>>>> This will use the default zstd level, level 3. I think it would make more
>>>> sense to use a high compression level. Level 19 would probably be a good
>>>> choice. That will choose a window size of up to 8MB, meaning the
>>>> decompressor needs to allocate that much memory. If that is unacceptable,
>>>> you could use `zstd -T0 --rm -f -q -19 --zstd=wlog=21`, which will use a
>>>> window size of up to 2MB, to match the XZ command. Note that if the file is
>>>> smaller than the window size, the window will be shrunk to the smallest
>>>> power of two at least as large as the file.
>>> 
>>> Please no. We've already done that with initramfs in Arch, and it
>>> increased the time to generate it enormously.
>>> 
>>> I understand that building a kernel is a rarer operation than
>>> regenerating initramfs, but still I'd go against hard-coding the level.
>>> And if it should be specified anyway, I'd opt for an explicit
>>> configuration option. Remember, not all kernels are built on
>>> build farms...
>>> 
>>> FWIW, Piotr originally used level 9 which worked okay, but I insisted
>>> on sending the patch initially without specifying level at all like it is
>>> done for other compressors. If this is a wrong approach, then oh meh,
>>> mea culpa ;).
>>> 
>>> Whatever default non-standard compression level you choose, I'm fine
>>> as long as I can change it without editing Makefile.
>> 
>> That makes sense to me. I have a deep-seated need to compress files as
>> efficiently as possible for widely distributed packages. But, I understand
>> that slow compression significantly impacts build times for quick iteration.
>> I’d be happy with a compression level parameter that defaults to a happy
>> middle.
>>
>> I’m also fine with taking this patch as-is if it is easier, and I can put up
>> another patch that adds a compression level parameter, since I don’t want to
>> block merging this.
> 
> 
> I do not want to take such a patch.
> Meeting everyone's requirements
> results in a bad project for everyone.
> 
> 
> Does this work for you?
> 
> make modules_install ZSTD="zstd -19"

Yeah, that’s perfect. Do y

Re: [PATCH] init: add support for zstd compressed modules

2021-04-01 Thread Nick Terrell



> On Apr 1, 2021, at 12:54 AM, torv...@mailbox.org wrote:
> 
> Thanks Piotr, good work!
> Question: Is `-T0` really faster in this particular case than the default 
> `-T1`? Are modules installed sequentially?

The zstd CLI produces deterministic output regardless of the number of threads 
used. `-T1` (or not specifying `-T`) will produce the same output as `-T0`. 
`-T0` will be faster for large files (at the default level, multiple jobs will 
be spawned for files > 8MB), and be just as fast as `-T1` for smaller files.

Best,
Nick

> I also saw that Masahiro did some work on modules_install, moving 
> MODULE_COMPRESS from the base Makefile to scripts/Makefile.modinst, so 
> perhaps this should also be moved there at a later point.
> 
> Tor Vic



Re: [PATCH] init: add support for zstd compressed modules

2021-03-31 Thread Nick Terrell


> On Mar 31, 2021, at 10:48 AM, Oleksandr Natalenko  
> wrote:
> 
> Hello.
> 
> On Wed, Mar 31, 2021 at 05:39:25PM +, Nick Terrell wrote:
>> 
>> 
>>> On Mar 30, 2021, at 4:50 AM, Oleksandr Natalenko  
>>> wrote:
>>> 
>>> On Tue, Mar 30, 2021 at 01:32:35PM +0200, Piotr Gorski wrote:
>>>> kmod 28 supports modules compressed in zstd format so let's add this 
>>>> possibility to kernel.
>>>> 
>>>> Signed-off-by: Piotr Gorski 
>>>> ---
>>>> Makefile | 7 +--
>>>> init/Kconfig | 9 ++---
>>>> 2 files changed, 11 insertions(+), 5 deletions(-)
>>>> 
>>>> diff --git a/Makefile b/Makefile
>>>> index 5160ff8903c1..82f4f4cc2955 100644
>>>> --- a/Makefile
>>>> +++ b/Makefile
>>>> @@ -1156,8 +1156,8 @@ endif # INSTALL_MOD_STRIP
>>>> export mod_strip_cmd
>>>> 
>>>> # CONFIG_MODULE_COMPRESS, if defined, will cause module to be compressed
>>>> -# after they are installed in agreement with CONFIG_MODULE_COMPRESS_GZIP
>>>> -# or CONFIG_MODULE_COMPRESS_XZ.
>>>> +# after they are installed in agreement with CONFIG_MODULE_COMPRESS_GZIP,
>>>> +# CONFIG_MODULE_COMPRESS_XZ, or CONFIG_MODULE_COMPRESS_ZSTD.
>>>> 
>>>> mod_compress_cmd = true
>>>> ifdef CONFIG_MODULE_COMPRESS
>>>> @@ -1167,6 +1167,9 @@ ifdef CONFIG_MODULE_COMPRESS
>>>>  ifdef CONFIG_MODULE_COMPRESS_XZ
>>>>mod_compress_cmd = $(XZ) --lzma2=dict=2MiB -f
>>>>  endif # CONFIG_MODULE_COMPRESS_XZ
>>>> +  ifdef CONFIG_MODULE_COMPRESS_ZSTD
>>>> +mod_compress_cmd = $(ZSTD) -T0 --rm -f -q
>> 
>> This will use the default zstd level, level 3. I think it would make more
>> sense to use a high compression level. Level 19 would probably be a good
>> choice. That will choose a window size of up to 8MB, meaning the
>> decompressor needs to allocate that much memory. If that is unacceptable,
>> you could use `zstd -T0 --rm -f -q -19 --zstd=wlog=21`, which will use a
>> window size of up to 2MB, to match the XZ command. Note that if the file is
>> smaller than the window size, the window will be shrunk to the smallest
>> power of two at least as large as the file.
> 
> Please no. We've already done that with initramfs in Arch, and it
> increased the time to generate it enormously.
> 
> I understand that building a kernel is a rarer operation than
> regenerating initramfs, but still I'd go against hard-coding the level.
> And if it should be specified anyway, I'd opt for an explicit
> configuration option. Remember, not all kernels are built on
> build farms...
> 
> FWIW, Piotr originally used level 9 which worked okay, but I insisted
> on sending the patch initially without specifying level at all like it is
> done for other compressors. If this is a wrong approach, then oh meh,
> mea culpa ;).
> 
> Whatever default non-standard compression level you choose, I'm fine
> as long as I can change it without editing Makefile.

That makes sense to me. I have a deep-seated need to compress files as
efficiently as possible for widely distributed packages. But, I understand that
slow compression significantly impacts build times for quick iteration. I’d be
happy with a compression level parameter that defaults to a happy middle.

I’m also fine with taking this patch as-is if it is easier, and I can put up
another patch that adds a compression level parameter, since I don’t want to
block merging this.

Best,
Nick Terrell

> Thanks!
> 
>> 
>> Best,
>> Nick Terrell
>> 
>>>> +  endif # CONFIG_MODULE_COMPRESS_ZSTD
>>>> endif # CONFIG_MODULE_COMPRESS
>>>> export mod_compress_cmd
>>>> 
>>>> diff --git a/init/Kconfig b/init/Kconfig
>>>> index 8c2cfd88f6ef..86a452bc2747 100644
>>>> --- a/init/Kconfig
>>>> +++ b/init/Kconfig
>>>> @@ -2250,8 +2250,8 @@ config MODULE_COMPRESS
>>>>bool "Compress modules on installation"
>>>>help
>>>> 
>>>> -Compresses kernel modules when 'make modules_install' is run; gzip or
>>>> -xz depending on "Compression algorithm" below.
>>>> +Compresses kernel modules when 'make modules_install' is run; gzip,
>>>> +xz, or zstd depending on "Compression algorithm" below.
>>>> 
>>>>  module-init-tools MAY support gzip, and kmod MAY support gzip and xz.
>>>> 
>>>> @@ -2

Re: [PATCH] init: add support for zstd compressed modules

2021-03-31 Thread Nick Terrell



> On Mar 30, 2021, at 4:50 AM, Oleksandr Natalenko  
> wrote:
> 
> On Tue, Mar 30, 2021 at 01:32:35PM +0200, Piotr Gorski wrote:
>> kmod 28 supports modules compressed in zstd format so let's add this 
>> possibility to kernel.
>> 
>> Signed-off-by: Piotr Gorski 
>> ---
>> Makefile | 7 +--
>> init/Kconfig | 9 ++---
>> 2 files changed, 11 insertions(+), 5 deletions(-)
>> 
>> diff --git a/Makefile b/Makefile
>> index 5160ff8903c1..82f4f4cc2955 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -1156,8 +1156,8 @@ endif # INSTALL_MOD_STRIP
>> export mod_strip_cmd
>> 
>> # CONFIG_MODULE_COMPRESS, if defined, will cause module to be compressed
>> -# after they are installed in agreement with CONFIG_MODULE_COMPRESS_GZIP
>> -# or CONFIG_MODULE_COMPRESS_XZ.
>> +# after they are installed in agreement with CONFIG_MODULE_COMPRESS_GZIP,
>> +# CONFIG_MODULE_COMPRESS_XZ, or CONFIG_MODULE_COMPRESS_ZSTD.
>> 
>> mod_compress_cmd = true
>> ifdef CONFIG_MODULE_COMPRESS
>> @@ -1167,6 +1167,9 @@ ifdef CONFIG_MODULE_COMPRESS
>>   ifdef CONFIG_MODULE_COMPRESS_XZ
>> mod_compress_cmd = $(XZ) --lzma2=dict=2MiB -f
>>   endif # CONFIG_MODULE_COMPRESS_XZ
>> +  ifdef CONFIG_MODULE_COMPRESS_ZSTD
>> +mod_compress_cmd = $(ZSTD) -T0 --rm -f -q

This will use the default zstd level, level 3. I think it would make more sense
to use a high compression level. Level 19 would probably be a good choice. That
will choose a window size of up to 8MB, meaning the decompressor needs to
allocate that much memory. If that is unacceptable, you could use
`zstd -T0 --rm -f -q -19 --zstd=wlog=21`, which will use a window size of up to
2MB, to match the XZ command. Note that if the file is smaller than the window
size, the window will be shrunk to the smallest power of two at least as large
as the file.
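
To make the window-size rule above concrete, here is a minimal sketch in C. It
is not taken from zstd: `ceil_log2_u64()` and `effective_window_log()` are
hypothetical helper names, and the sketch ignores any lower bound zstd places
on the window log. It only models the behavior described above, where the
window is capped by the limit from the level or from `--zstd=wlog=` and
shrinks to the smallest power of two at least as large as the file.

```c
#include <stdint.h>

/* Smallest n such that (1 << n) >= size; returns 0 for size <= 1. */
static unsigned ceil_log2_u64(uint64_t size)
{
	unsigned n = 0;

	while (n < 63 && ((uint64_t)1 << n) < size)
		n++;
	return n;
}

/*
 * Hypothetical model of the rule in the text: the window log is the
 * smaller of the configured cap and the log2 of the smallest power of
 * two that covers the whole file.
 */
static unsigned effective_window_log(uint64_t file_size, unsigned max_window_log)
{
	unsigned needed = ceil_log2_u64(file_size);

	return needed < max_window_log ? needed : max_window_log;
}
```

Under these assumptions a 300 KiB module gets a window log of 19 (512 KiB)
whether the cap is 23 (8MB) or 21 (2MB), so the `wlog=21` cap only matters for
files larger than 2MB.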

Best,
Nick Terrell

>> +  endif # CONFIG_MODULE_COMPRESS_ZSTD
>> endif # CONFIG_MODULE_COMPRESS
>> export mod_compress_cmd
>> 
>> diff --git a/init/Kconfig b/init/Kconfig
>> index 8c2cfd88f6ef..86a452bc2747 100644
>> --- a/init/Kconfig
>> +++ b/init/Kconfig
>> @@ -2250,8 +2250,8 @@ config MODULE_COMPRESS
>>  bool "Compress modules on installation"
>>  help
>> 
>> -  Compresses kernel modules when 'make modules_install' is run; gzip or
>> -  xz depending on "Compression algorithm" below.
>> +  Compresses kernel modules when 'make modules_install' is run; gzip,
>> +  xz, or zstd depending on "Compression algorithm" below.
>> 
>>module-init-tools MAY support gzip, and kmod MAY support gzip and xz.
>> 
>> @@ -2273,7 +2273,7 @@ choice
>>This determines which sort of compression will be used during
>>'make modules_install'.
>> 
>> -  GZIP (default) and XZ are supported.
>> +  GZIP (default), XZ, and ZSTD are supported.
>> 
>> config MODULE_COMPRESS_GZIP
>>  bool "GZIP"
>> @@ -2281,6 +2281,9 @@ config MODULE_COMPRESS_GZIP
>> config MODULE_COMPRESS_XZ
>>  bool "XZ"
>> 
>> +config MODULE_COMPRESS_ZSTD
>> +bool "ZSTD"
>> +
>> endchoice
>> 
>> config MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS
>> -- 
>> 2.31.0.97.g1424303384
>> 
> 
> Great!
> 
> Reviewed-by: Oleksandr Natalenko 
> 
> This works perfectly fine in Arch Linux if accompanied by the
> following mkinitcpio amendment: [1].
> 
> I'm also Cc'ing other people from get_maintainers output just
> to make this submission more visible.
> 
> Thanks.
> 
> [1] https://github.com/archlinux/mkinitcpio/pull/43
> 
> -- 
>  Oleksandr Natalenko (post-factum)



[PATCH v9 2/3] lib: zstd: Add decompress_sources.h for decompress_unzstd

2021-03-30 Thread Nick Terrell
From: Nick Terrell 

Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.

This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.

Signed-off-by: Nick Terrell 
---
 lib/decompress_unzstd.c   |  6 +-
 lib/zstd/decompress_sources.h | 23 +++
 2 files changed, 24 insertions(+), 5 deletions(-)
 create mode 100644 lib/zstd/decompress_sources.h

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index c88aad49e996..6e5ecfba0a8d 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
 #ifdef STATIC
 # define UNZSTD_PREBOOT
 # include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
 #endif
 
 #include 
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index ..d82cea4316f5
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Facebook, Inc.
+ * All rights reserved.
+ *
+ * This source code is licensed under both the BSD-style license (found in the
+ * LICENSE file in the root directory of this source tree) and the GPLv2 (found
+ * in the COPYING file in the root directory of this source tree).
+ * You may select, at your option, one of the above-listed licenses.
+ */
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
-- 
2.31.0



[PATCH v9 1/3] lib: zstd: Add kernel-specific API

2021-03-30 Thread Nick Terrell
From: Nick Terrell 

This patch:
- Moves `include/linux/zstd.h` -> `include/linux/zstd_lib.h`
- Updates modified zstd headers to yearless copyright
- Adds a new API in `include/linux/zstd.h` that is functionally
  equivalent to the in-use subset of the current API. Functions are
  renamed to avoid symbol collisions with zstd, to make it clear it is
  not the upstream zstd API, and to follow the kernel style guide.
- Updates all callers to use the new API.
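
The wrapper pattern in the list above can be sketched roughly as follows. This
is an illustration only, not code from the patch: `UPSTREAM_isError()`,
`example_is_error()` and `example_check()` are hypothetical names, and the
error encoding in the stand-in function is an assumption made so that the
sketch is self-contained.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for an upstream-style (CamelCase) library function.
 * Assumption for this sketch: error codes are size_t values near SIZE_MAX.
 */
static unsigned UPSTREAM_isError(size_t code)
{
	return code > (size_t)-120;
}

/*
 * Kernel-style wrapper: the snake_case name cannot collide with the
 * upstream symbol, and callers never use the upstream API directly.
 */
static unsigned int example_is_error(size_t code)
{
	return UPSTREAM_isError(code);
}

/* A caller written against the wrapper only. */
static int example_check(size_t ret)
{
	return example_is_error(ret) ? -1 : 0;
}
```

Because callers such as `example_check()` depend only on the wrapper, the
function behind it can be replaced without touching them.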

There are no functional changes in this patch. Since there are no
functional changes, I felt it was okay to update all the callers in a
single patch. Once the API is approved, the callers are mechanically
changed.

This patch is preparing for the 3rd patch in this series, which updates
zstd to version 1.4.10. Since the upstream zstd API is no longer exposed
to callers, the update can happen transparently.

Signed-off-by: Nick Terrell 
---
 crypto/zstd.c  |   28 +-
 fs/btrfs/zstd.c|   68 +-
 fs/f2fs/compress.c |   56 +-
 fs/f2fs/super.c|2 +-
 fs/pstore/platform.c   |2 +-
 fs/squashfs/zstd_wrapper.c |   16 +-
 include/linux/zstd.h   | 1243 
 include/linux/zstd_lib.h   | 1157 +
 lib/decompress_unzstd.c|   42 +-
 lib/zstd/compress.c|  123 ++--
 lib/zstd/decompress.c  |  112 ++--
 11 files changed, 1691 insertions(+), 1158 deletions(-)
 create mode 100644 include/linux/zstd_lib.h

diff --git a/crypto/zstd.c b/crypto/zstd.c
index 1a3309f066f7..154a969c83a8 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -18,22 +18,22 @@
 #define ZSTD_DEF_LEVEL 3
 
 struct zstd_ctx {
-   ZSTD_CCtx *cctx;
-   ZSTD_DCtx *dctx;
+   zstd_cctx *cctx;
+   zstd_dctx *dctx;
void *cwksp;
void *dwksp;
 };
 
-static ZSTD_parameters zstd_params(void)
+static zstd_parameters zstd_params(void)
 {
-   return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
+   return zstd_get_params(ZSTD_DEF_LEVEL, 0);
 }
 
 static int zstd_comp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const ZSTD_parameters params = zstd_params();
-   const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+   const zstd_parameters params = zstd_params();
+   const size_t wksp_size = zstd_cctx_workspace_bound();
 
ctx->cwksp = vzalloc(wksp_size);
if (!ctx->cwksp) {
@@ -41,7 +41,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+   ctx->cctx = zstd_init_cctx(ctx->cwksp, wksp_size);
if (!ctx->cctx) {
ret = -EINVAL;
goto out_free;
@@ -56,7 +56,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 static int zstd_decomp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+   const size_t wksp_size = zstd_dctx_workspace_bound();
 
ctx->dwksp = vzalloc(wksp_size);
if (!ctx->dwksp) {
@@ -64,7 +64,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+   ctx->dctx = zstd_init_dctx(ctx->dwksp, wksp_size);
if (!ctx->dctx) {
ret = -EINVAL;
goto out_free;
@@ -152,10 +152,10 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
 {
size_t out_len;
struct zstd_ctx *zctx = ctx;
-   const ZSTD_parameters params = zstd_params();
+   const zstd_parameters params = zstd_params();
 
-   out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
-   if (ZSTD_isError(out_len))
+   out_len = zstd_compress_cctx(zctx->cctx, dst, *dlen, src, slen, 
);
+   if (zstd_is_error(out_len))
return -EINVAL;
*dlen = out_len;
return 0;
@@ -182,8 +182,8 @@ static int __zstd_decompress(const u8 *src, unsigned int slen,
size_t out_len;
struct zstd_ctx *zctx = ctx;
 
-   out_len = ZSTD_decompressDCtx(zctx->dctx, dst, *dlen, src, slen);
-   if (ZSTD_isError(out_len))
+   out_len = zstd_decompress_dctx(zctx->dctx, dst, *dlen, src, slen);
+   if (zstd_is_error(out_len))
return -EINVAL;
*dlen = out_len;
return 0;
diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index 8e9626d63976..14418b02c189 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -28,10 +28,10 @@
 /* 307s to avoid pathologically clashing with transaction commit */
 #define ZSTD_BTRFS_RECLAIM_JIFFIES (307 * HZ)
 
-static ZSTD_parameters zstd_get_btrfs_parameters(unsigned int level,
+static zstd_parameters zstd_get_btrfs_parameters(unsigned int level,
 size_t src_len)
 {
-   ZSTD_parameters params = ZSTD_getParams(level, src_len, 0);
+ 

[GIT PULL][PATCH v9 0/3] Update to zstd-1.4.10

2021-03-30 Thread Nick Terrell
From: Nick Terrell 

Please pull from

  g...@github.com:terrelln/linux.git tags/v9-zstd-1.4.10

to get these changes. Alternatively the patchset is included.

This patchset upgrades the zstd library to the latest upstream release. The
current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.10 upstream. I want the zstd version in the
kernel to match up with a known upstream release, so we know exactly what code
is running. Whenever this patchset is ready for merge, I will cut a release at
the upstream commit that gets merged. This should not be necessary for future
releases.

The kernel zstd library is automatically generated from upstream zstd. A script
makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested every commit by upstream's continuous integration.
When we cut a new zstd release, we will submit a patch to the kernel to update
the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must build,
so that precludes partial updates. Since the commit is 100% generated, I hope the
review burden is lightened. I considered replaying upstream commits, but that is
not possible because there have been ~3500 upstream commits since the last zstd
import, and the commits don't all build individually. The bulk update preserves
bisectability because bugs can be bisected to the zstd version update. At that
point the update can be reverted, and we can work with upstream to find and fix
the bug. After this big switch in how the kernel consumes zstd, future patches
will be smaller, because they will only have one upstream release worth of
changes each.

This patchset adds a new kernel-style wrapper around zstd. This wrapper API is
functionally equivalent to the subset of the current zstd API that is currently
used. The wrapper API is renamed to kernel style so that the symbols don't
collide with zstd's symbols. The update to zstd-1.4.10 maintains the same API
and preserves the semantics, so that none of the callers need to be updated.
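
As a rough illustration of the wrapper idea described above, here is a
userspace sketch. It is not the actual kernel header: FOO_isError() is a
made-up stand-in for an upstream-style function, foo_is_error() is a made-up
kernel-style wrapper, and the error encoding is an assumption chosen only so
the example compiles and runs.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for an upstream-style function. The encoding is an
 * assumption for this sketch: the top 127 values of size_t count as errors.
 */
static unsigned FOO_isError(size_t code)
{
	return code > (size_t)-128;
}

/*
 * Kernel-style wrapper: a separate, lower-case symbol that only forwards
 * to the upstream-style function. Callers see just this name, so the
 * library behind it can be replaced without touching them.
 */
static unsigned int foo_is_error(size_t code)
{
	return FOO_isError(code);
}
```

A caller would write foo_is_error(ret) and never mention the upstream-style
name, which is what lets the library version change underneath it.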

This patchset comes in 2 parts:
1. The first 2 patches prepare for the zstd upgrade. The first patch adds the
   new kernel style API so zstd can be upgraded without modifying any callers.
   The second patch adds an indirection for the lib/decompress_unzstd.c
   including of all decompression source files.
2. Import zstd-1.4.10. This patch is completely generated from upstream using
   automated tooling.

I tested every caller of zstd on x86_64. I tested both after the 1.4.10 upgrade
using the compatibility wrapper, and after the final patch in this series.

I tested kernel and initramfs decompression in i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
I found:
* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes. For example the problem with large kernel
decompression has been fixed upstream for over 2 years
https://lkml.org/lkml/2020/9/29/27.

Please let me know if there is anything that I can do to ease the way for these
patches. I think it is important because it gets large performance improvements,
contains bug fixes, and is switching to a more maintainable model of consuming
upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix ZSTD_initCStream() wrapper to treat pledged_src_size=0 as unknown.
  This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.
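
One plausible reading of the pledged_src_size fix is a sentinel translation.
The sketch below assumes the wrapper's callers pass 0 for "size not known"
while the library underneath expects an all-ones value (upstream zstd defines
ZSTD_CONTENTSIZE_UNKNOWN as (0ULL - 1)). The macro and function names are
hypothetical and are not taken from the patch.

```c
/* Hypothetical name; mirrors the all-ones "unknown size" sentinel. */
#define EXAMPLE_CONTENTSIZE_UNKNOWN (0ULL - 1)

/*
 * Translate the caller-facing convention (0 means "unknown") into the
 * sentinel convention before the value is handed to the library.
 */
static unsigned long long
example_translate_pledged_size(unsigned long long pledged_src_size)
{
	if (pledged_src_size == 0)
		return EXAMPLE_CONTENTSIZE_UNKNOWN;
	return pledged_src_size;
}
```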

v2 -> v3:
* (3/9) Silence warnings by Kernel Test Robot:
  https://github.com/facebook/zstd/pull/2324
  Stack size warnings remain, but these aren't new, and the functions it warns on
  are either unused or not in the maximum stack path. This patchset reduces zstd
  compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and
  more stack reduction would require significant changes that have the potential
  to introduce new bugs. However, I do hope to continue to reduce zstd stack
  usage in future versions.

v3 -> v4:
* (3/9) Fix errors and warnings reported by Kernel Test Robot

Re: [PATCH v8 1/3] lib: zstd: Add kernel-specific API

2021-03-28 Thread Nick Terrell
On Sat, Mar 27, 2021 at 2:48 PM Oleksandr Natalenko
 wrote:
>
> Hello.
>
> On Sat, Mar 27, 2021 at 05:48:01PM +0800, kernel test robot wrote:
> > >> ERROR: modpost: "ZSTD_maxCLevel" [fs/f2fs/f2fs.ko] undefined!
>
> Since f2fs can be built as a module, the following correction seems to
> be needed:

Thanks Oleksandr! Looks like f2fs has been updated to use
ZSTD_maxCLevel() since the first version of these patches. I'll put up
a new version shortly with the fix, and update my test suite to build
f2fs and other users as modules, so it can catch this.

Best,
Nick

> ```
> diff --git a/lib/zstd/compress/zstd_compress.c b/lib/zstd/compress/zstd_compress.c
> index 9c998052a0e5..584c92c51169 100644
> --- a/lib/zstd/compress/zstd_compress.c
> +++ b/lib/zstd/compress/zstd_compress.c
> @@ -4860,6 +4860,7 @@ size_t ZSTD_endStream(ZSTD_CStream* zcs, ZSTD_outBuffer* output)
>
>  #define ZSTD_MAX_CLEVEL 22
>  int ZSTD_maxCLevel(void) { return ZSTD_MAX_CLEVEL; }
> +EXPORT_SYMBOL(ZSTD_maxCLevel);
>  int ZSTD_minCLevel(void) { return (int)-ZSTD_TARGETLENGTH_MAX; }
>
>  static const ZSTD_compressionParameters ZSTD_defaultCParameters[4][ZSTD_MAX_CLEVEL+1] = {
> ```
>
> Not sure if the same should be done for `ZSTD_minCLevel()` since I don't
> see it being used anywhere else.
>
> --
>   Oleksandr Natalenko (post-factum)


Re: [PATCH v8 3/3] lib: zstd: Upgrade to latest upstream zstd version 1.4.10

2021-03-26 Thread Nick Terrell
On Fri, Mar 26, 2021 at 3:02 PM kernel test robot  wrote:
>
> Hi Nick,
>
> Thank you for the patch! Perhaps something to improve:
>
> [auto build test WARNING on cryptodev/master]
> [also build test WARNING on kdave/for-next f2fs/dev-test linus/master 
> v5.12-rc4 next-20210326]
> [cannot apply to crypto/master kees/for-next/pstore squashfs/master]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url:
> https://github.com/0day-ci/linux/commits/Nick-Terrell/Update-to-zstd-1-4-10/20210327-031827
> base:   
> https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git 
> master
> config: um-allmodconfig (attached as .config)
> compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
> reproduce (this is a W=1 build):
> # 
> https://github.com/0day-ci/linux/commit/ebbff13fa6a537fb8b3dc6b42c3093f9ce4358f8
> git remote add linux-review https://github.com/0day-ci/linux
> git fetch --no-tags linux-review 
> Nick-Terrell/Update-to-zstd-1-4-10/20210327-031827
> git checkout ebbff13fa6a537fb8b3dc6b42c3093f9ce4358f8
> # save the attached .config to linux build tree
> make W=1 ARCH=um
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
>
> All warnings (new ones prefixed by >>):
>
>lib/zstd/compress/zstd_compress_sequences.c:17: warning: Cannot understand 
>  * -log2(x / 256) lookup table for x in [0, 256).
> on line 17 - I thought it was a doc line
>lib/zstd/compress/zstd_compress_sequences.c:58: warning: Function 
> parameter or member 'nbSeq' not described in 'ZSTD_useLowProbCount'
> >> lib/zstd/compress/zstd_compress_sequences.c:58: warning: expecting 
> >> prototype for 1 else we should(). Prototype was for ZSTD_useLowProbCount() 
> >> instead
> >> lib/zstd/compress/zstd_compress_sequences.c:67: warning: wrong kernel-doc 
> >> identifier on line:
> * Returns the cost in bytes of encoding the normalized count header.
>lib/zstd/compress/zstd_compress_sequences.c:85: warning: Function 
> parameter or member 'count' not described in 'ZSTD_entropyCost'
>lib/zstd/compress/zstd_compress_sequences.c:85: warning: Function 
> parameter or member 'max' not described in 'ZSTD_entropyCost'
>lib/zstd/compress/zstd_compress_sequences.c:85: warning: Function 
> parameter or member 'total' not described in 'ZSTD_entropyCost'
> >> lib/zstd/compress/zstd_compress_sequences.c:85: warning: expecting 
> >> prototype for Returns the cost in bits of encoding the distribution 
> >> described by count(). Prototype was for ZSTD_entropyCost() instead
>lib/zstd/compress/zstd_compress_sequences.c:99: warning: wrong kernel-doc 
> identifier on line:
> * Returns the cost in bits of encoding the distribution in count using 
> ctable.
>lib/zstd/compress/zstd_compress_sequences.c:139: warning: Function 
> parameter or member 'norm' not described in 'ZSTD_crossEntropyCost'
>lib/zstd/compress/zstd_compress_sequences.c:139: warning: Function 
> parameter or member 'accuracyLog' not described in 'ZSTD_crossEntropyCost'
>lib/zstd/compress/zstd_compress_sequences.c:139: warning: Function 
> parameter or member 'count' not described in 'ZSTD_crossEntropyCost'
>lib/zstd/compress/zstd_compress_sequences.c:139: warning: Function 
> parameter or member 'max' not described in 'ZSTD_crossEntropyCost'
> >> lib/zstd/compress/zstd_compress_sequences.c:139: warning: expecting 
> >> prototype for Returns the cost in bits of encoding the distribution in 
> >> count using the(). Prototype was for ZSTD_crossEntropyCost() instead
> --
>lib/zstd/compress/zstd_ldm.c:584: warning: Function parameter or member 
> 'rawSeqStore' not described in 'maybeSplitSequence'
>lib/zstd/compress/zstd_ldm.c:584: warning: Function parameter or member 
> 'remaining' not described in 'maybeSplitSequence'
>lib/zstd/compress/zstd_ldm.c:584: warning: Function parameter or member 
> 'minMatch' not described in 'maybeSplitSequence'
> >> lib/zstd/compress/zstd_ldm.c:584: warning: expecting prototype for If the 
> >> sequence length is longer than remaining then the sequence is split(). 
> >> Prototype was for maybeSplitSequence() instead
> --
> >> lib/zstd/decompress/zstd_decompress.c:992: warning: wrong kernel-doc 
> >> identifier on line:
> * Similar to ZSTD_nextSrcSizeToDecompress(), but when when a block input 
> can be streamed,
> --
>lib/zstd/decompress/huf_decompress.c:122: warning: Function parameter or 
> member 'symb

[PATCH v8 2/3] lib: zstd: Add decompress_sources.h for decompress_unzstd

2021-03-26 Thread Nick Terrell
From: Nick Terrell 

Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.

This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.

Signed-off-by: Nick Terrell 
---
 lib/decompress_unzstd.c   |  6 +-
 lib/zstd/decompress_sources.h | 14 ++
 2 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 lib/zstd/decompress_sources.h

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index dab2d55cf08d..e6897a5063a7 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
 #ifdef STATIC
 # define UNZSTD_PREBOOT
 # include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
 #endif
 
 #include 
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index ..d2fe10af0043
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
-- 
2.31.0



[PATCH v8 0/3] Update to zstd-1.4.10

2021-03-26 Thread Nick Terrell
From: Nick Terrell 

Please pull from

  g...@github.com:terrelln/linux.git tags/v8-zstd-1.4.10

to get these changes. Alternatively the patchset is included.

This patchset upgrades the zstd library to the latest upstream release. The
current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.10 upstream. I want the zstd version in the
kernel to match up with a known upstream release, so we know exactly what code
is running. Whenever this patchset is ready for merge, I will cut a release at
the upstream commit that gets merged. This should not be necessary for future
releases.

The kernel zstd library is automatically generated from upstream zstd. A script
makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested every commit by upstream's continuous integration.
When we cut a new zstd release, we will submit a patch to the kernel to update
the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must build,
so that precludes partial updates. Since the commit is 100% generated, I hope the
review burden is lightened. I considered replaying upstream commits, but that is
not possible because there have been ~3500 upstream commits since the last zstd
import, and the commits don't all build individually. The bulk update preserves
bisectability because bugs can be bisected to the zstd version update. At that
point the update can be reverted, and we can work with upstream to find and fix
the bug. After this big switch in how the kernel consumes zstd, future patches
will be smaller, because they will only have one upstream release worth of
changes each.

This patchset adds a new kernel-style wrapper around zstd. This wrapper API is
functionally equivalent to the subset of the current zstd API that is currently
used. The wrapper API is renamed to kernel style so that the symbols don't
collide with zstd's symbols. The update to zstd-1.4.10 maintains the same API
and preserves the semantics, so that none of the callers need to be updated.

This patchset comes in 2 parts:
1. The first 2 patches prepare for the zstd upgrade. The first patch adds the
   new kernel style API so zstd can be upgraded without modifying any callers.
   The second patch adds an indirection for the lib/decompress_unzstd.c
   including of all decompression source files.
2. Import zstd-1.4.10. This patch is completely generated from upstream using
   automated tooling.

I tested every caller of zstd on x86_64. I tested both after the 1.4.10 upgrade
using the compatibility wrapper, and after the final patch in this series.

I tested kernel and initramfs decompression in i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.10.
I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
I found:
* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes. For example the problem with large kernel
decompression has been fixed upstream for over 2 years
https://lkml.org/lkml/2020/9/29/27.

Please let me know if there is anything that I can do to ease the way for these
patches. I think it is important because it gets large performance improvements,
contains bug fixes, and is switching to a more maintainable model of consuming
upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix ZSTD_initCStream() wrapper to treat pledged_src_size=0 as unknown.
  This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.

v2 -> v3:
* (3/9) Silence warnings by Kernel Test Robot:
  https://github.com/facebook/zstd/pull/2324
  Stack size warnings remain, but these aren't new, and the functions it warns on
  are either unused or not in the maximum stack path. This patchset reduces zstd
  compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and
  more stack reduction would require significant changes that have the potential
  to introduce new bugs. However, I do hope to continue to reduce zstd stack
  usage in future versions.

v3 -> v4:
* (3/9) Fix errors and warnings reported by Kernel Test Robot

[PATCH v8 1/3] lib: zstd: Add kernel-specific API

2021-03-26 Thread Nick Terrell
From: Nick Terrell 

This patch:
- Moves `include/linux/zstd.h` -> `include/linux/zstd_lib.h`
- Adds a new API in `include/linux/zstd.h` that is functionally
  equivalent to the in-use subset of the current API. Functions are
  renamed to avoid symbol collisions with zstd, to make it clear it is
  not the upstream zstd API, and to follow the kernel style guide.
- Updates all callers to use the new API.

There are no functional changes in this patch. Since there are no
functional changes, I felt it was okay to update all the callers in a
single patch. Once the API is approved, the callers are mechanically
changed.

This patch is preparing for the 3rd patch in this series, which updates
zstd to version 1.4.10. Since the upstream zstd API is no longer exposed
to callers, the update can happen transparently.

Signed-off-by: Nick Terrell 
---
 crypto/zstd.c  |   28 +-
 fs/btrfs/zstd.c|   68 +-
 fs/f2fs/compress.c |   56 +-
 fs/pstore/platform.c   |2 +-
 fs/squashfs/zstd_wrapper.c |   16 +-
 include/linux/zstd.h   | 1218 
 include/linux/zstd_lib.h   | 1157 ++
 lib/decompress_unzstd.c|   42 +-
 lib/zstd/compress.c|  107 ++--
 lib/zstd/decompress.c  |  112 ++--
 10 files changed, 1657 insertions(+), 1149 deletions(-)
 create mode 100644 include/linux/zstd_lib.h

diff --git a/crypto/zstd.c b/crypto/zstd.c
index 1a3309f066f7..154a969c83a8 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -18,22 +18,22 @@
 #define ZSTD_DEF_LEVEL 3
 
 struct zstd_ctx {
-   ZSTD_CCtx *cctx;
-   ZSTD_DCtx *dctx;
+   zstd_cctx *cctx;
+   zstd_dctx *dctx;
void *cwksp;
void *dwksp;
 };
 
-static ZSTD_parameters zstd_params(void)
+static zstd_parameters zstd_params(void)
 {
-   return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
+   return zstd_get_params(ZSTD_DEF_LEVEL, 0);
 }
 
 static int zstd_comp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const ZSTD_parameters params = zstd_params();
-   const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+   const zstd_parameters params = zstd_params();
+   const size_t wksp_size = zstd_cctx_workspace_bound(&params.cParams);
 
ctx->cwksp = vzalloc(wksp_size);
if (!ctx->cwksp) {
@@ -41,7 +41,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+   ctx->cctx = zstd_init_cctx(ctx->cwksp, wksp_size);
if (!ctx->cctx) {
ret = -EINVAL;
goto out_free;
@@ -56,7 +56,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 static int zstd_decomp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+   const size_t wksp_size = zstd_dctx_workspace_bound();
 
ctx->dwksp = vzalloc(wksp_size);
if (!ctx->dwksp) {
@@ -64,7 +64,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+   ctx->dctx = zstd_init_dctx(ctx->dwksp, wksp_size);
if (!ctx->dctx) {
ret = -EINVAL;
goto out_free;
@@ -152,10 +152,10 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
 {
size_t out_len;
struct zstd_ctx *zctx = ctx;
-   const ZSTD_parameters params = zstd_params();
+   const zstd_parameters params = zstd_params();
 
-   out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
-   if (ZSTD_isError(out_len))
+   out_len = zstd_compress_cctx(zctx->cctx, dst, *dlen, src, slen, &params);
+   if (zstd_is_error(out_len))
return -EINVAL;
*dlen = out_len;
return 0;
@@ -182,8 +182,8 @@ static int __zstd_decompress(const u8 *src, unsigned int slen,
size_t out_len;
struct zstd_ctx *zctx = ctx;
 
-   out_len = ZSTD_decompressDCtx(zctx->dctx, dst, *dlen, src, slen);
-   if (ZSTD_isError(out_len))
+   out_len = zstd_decompress_dctx(zctx->dctx, dst, *dlen, src, slen);
+   if (zstd_is_error(out_len))
return -EINVAL;
*dlen = out_len;
return 0;
diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index 8e9626d63976..14418b02c189 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -28,10 +28,10 @@
 /* 307s to avoid pathologically clashing with transaction commit */
 #define ZSTD_BTRFS_RECLAIM_JIFFIES (307 * HZ)
 
-static ZSTD_parameters zstd_get_btrfs_parameters(unsigned int level,
+static zstd_parameters zstd_get_btrfs_parameters(unsigned int level,
 size_t src_len)
 {
-   ZSTD_parameters params = ZSTD_getParams(level, src_len, 0);
+   zstd_parameters params = zstd_get_params(level, src_len);
 
if (params.cParams.windowLog > ZSTD_B

Re: [f2fs-dev] [PATCH v7 0/3] Update to zstd-1.4.6

2020-12-18 Thread Nick Terrell


> On Dec 16, 2020, at 5:23 PM, Michał Mirosław  wrote:
> 
> On Wed, Dec 16, 2020 at 10:07:38PM +0000, Nick Terrell wrote:
> [...]
>> It is very large. If it helps, in the commit message I’ve provided this link [0],
>> which provides the diff between upstream zstd as-is and the imported zstd,
>> which has been modified by the automated tooling to work in the kernel.
>> [0] https://github.com/terrelln/linux/commit/ac2ee65dcb7318afe426ad08f6a844faf3aebb41
> 
> It looks like you could remove a bit more dead code by noting __GNUC__ >= 4
> (gcc-4.9 is currently the oldest supported [1]).

Yeah, that would certainly be possible. My goal was to remove the most egregiously
irrelevant code from the kernel, in addition to unused functions which would generate
-Wframe-larger-than compiler warnings. My tooling doesn’t have the logic to reason
about >= relationships yet. If it isn’t too hard to add, I may go ahead and do that,
otherwise I will leave it for future work. I view that as a “nice to have” instead of a
hard requirement, though let me know if you disagree.
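
To make the suggestion concrete, a guard of the kind being discussed might
look like the sketch below. The macro and function names are hypothetical and
are not taken from zstd. The point is only that with gcc-4.9 as the oldest
supported compiler, the #else branch can never be selected in a kernel build,
so tooling that can evaluate ">=" could drop it.

```c
/*
 * EXAMPLE_EXPECT_TRUE is a made-up macro for illustration. When the
 * compiler is GCC 4 or newer (or a compatible compiler that defines
 * __GNUC__ accordingly), only the first branch is ever used.
 */
#if defined(__GNUC__) && (__GNUC__ >= 4)
#  define EXAMPLE_EXPECT_TRUE(x) __builtin_expect(!!(x), 1)
#else
#  define EXAMPLE_EXPECT_TRUE(x) (!!(x))
#endif

/* Small user of the macro so both outcomes can be exercised. */
static int example_is_positive(int v)
{
	if (EXAMPLE_EXPECT_TRUE(v > 0))
		return 1;
	return 0;
}
```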

Best,
Nick

> [1] Documentation/process/changes.rst
> 
> Best Regards
> Michał Mirosław



Re: [f2fs-dev] [PATCH v7 0/3] Update to zstd-1.4.6

2020-12-16 Thread Nick Terrell


> On Dec 16, 2020, at 10:50 AM, David Sterba  wrote:
> 
> On Wed, Dec 16, 2020 at 11:58:07AM +1100, Herbert Xu wrote:
>> On Wed, Dec 16, 2020 at 12:48:51AM +0000, Nick Terrell wrote:
>>> 
>>> Thanks for the advice! The first zstd patches went through Herbert’s tree, which is
>>> why I’ve sent them this way.
>> 
>> Sorry, but I'm not touching these patches as Christoph's objections
>> don't seem to have been addressed.
> 
> I have objections to the current patchset as well, the build bot has
> found that some of the function frames are overly large (up to 3800
> bytes) [1],

Sorry I missed your reply David, it didn’t make it to my inbox.

Compiled with x86-64, arm, and aarch64 that function does not trigger any
-Wframe-larger-than= warnings during the kernel build. It seems like the
compiler backend for the parisc architecture (the architecture that the build
bot used) is doing a particularly bad job at optimizing this function, because
there is nothing in there that should be using that much stack space.

I have a test in upstream zstd that measures the stack high water mark for
all usage of zstd compression currently in use in the kernel. It says that zstd
uses 2KB of stack space in total on x86-64. I used this test to remove 1KB of
stack usage from upstream zstd. But, this is still 400 bytes more than the
current version of zstd in the kernel. I will look into squeezing out those last
400 bytes of stack usage.
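
For readers unfamiliar with the technique: a high water mark is commonly
measured by painting memory with a known pattern, running the code, and
checking how much of the pattern was overwritten. The sketch below applies
that idea to an ordinary buffer rather than the real stack. It is not the
upstream zstd test, and all names in it are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define EXAMPLE_PAINT 0xA5

/* Fill the region with a known pattern before running the workload. */
static void example_paint(unsigned char *buf, size_t size)
{
	memset(buf, EXAMPLE_PAINT, size);
}

/*
 * Return the high water mark: the number of bytes from the start of the
 * region up to and including the last byte that no longer holds the paint
 * pattern. A workload that happens to write the paint value itself would
 * be undercounted, which is a known limit of this technique.
 */
static size_t example_high_water_mark(const unsigned char *buf, size_t size)
{
	size_t i = size;

	while (i > 0 && buf[i - 1] == EXAMPLE_PAINT)
		i--;
	return i;
}

/* Stand-in workload that scribbles over the first `used` bytes. */
static void example_workload(unsigned char *buf, size_t used)
{
	memset(buf, 0, used);
}
```

Comparing the mark before and after a code change gives the kind of "N bytes
more than before" figure quoted above, under the assumption that the workload
exercises the deepest path.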

> besides the original complaint that the patch 3/3 is 1.5MiB.
> 
> [1] https://lore.kernel.org/lkml/20201204140314.gs6...@twin.jikos.cz/

It is very large. If it helps, in the commit message I’ve provided this link [0],
which provides the diff between upstream zstd as-is and the imported zstd,
which has been modified by the automated tooling to work in the kernel.

[0] https://github.com/terrelln/linux/commit/ac2ee65dcb7318afe426ad08f6a844faf3aebb41

Best,
Nick

Re: [f2fs-dev] [PATCH v7 0/3] Update to zstd-1.4.6

2020-12-15 Thread Nick Terrell


> On Dec 15, 2020, at 4:58 PM, Herbert Xu  wrote:
> 
> On Wed, Dec 16, 2020 at 12:48:51AM +0000, Nick Terrell wrote:
>> 
>> Thanks for the advice! The first zstd patches went through Herbert’s tree, which is
>> why I’ve sent them this way.
> 
> Sorry, but I'm not touching these patches as Christoph's objections
> don't seem to have been addressed.

I believe I’ve addressed Christoph's objections. He suggested creating
a wrapper API to avoid changing callers upon the zstd update. I’ve done
that. The only difference between the current API and the changes I’ve
proposed in patch 1 is that I’ve changed the prefix from ZSTD_ to zstd_ to
avoid conflicts & confusion with the upstream zstd API.

Christoph, if you get a chance to take a look at these patches, please let
me know what you think about the current iteration of patches, and if I’ve
addressed all of your concerns.

Best,
Nick

Re: [f2fs-dev] [PATCH v7 0/3] Update to zstd-1.4.6

2020-12-15 Thread Nick Terrell


> On Dec 15, 2020, at 4:00 PM, Eric Biggers  wrote:
> 
> On Tue, Dec 15, 2020 at 08:58:52PM +0000, Nick Terrell via Linux-f2fs-devel wrote:
>> 
>> 
>>> On Dec 3, 2020, at 12:51 PM, Nick Terrell  wrote:
>>> 
>>> From: Nick Terrell 
>>> 
>>> Please pull from
>>> 
>>> g...@github.com:terrelln/linux.git tags/v7-zstd-1.4.6
>>> 
>>> to get these changes. Alternatively the patchset is included.
>> 
>> Is it possible to get this patchset merged in the 5.11 merge window? It applies
>> cleanly to the latest master. Please let me know if there is anything that I can do
>> to drive this patchset towards merge.
>> 
>> Thanks,
>> Nick
> 
> Well, it's too late for 5.11 for patches that weren't already in linux-next, so
> you'll have to aim for 5.12.
> 
> It looks like you're asking Herbert to pull this into the crypto tree?  If he's
> interested in doing that, that could work.  However lib/zstd/ isn't that
> strongly "crypto-related", and it doesn't actually have a maintainer listed in
> MAINTAINERS.  Perhaps another path forwards is for you to volunteer to maintain
> lib/zstd/ and send pull requests for it directly to Linus?

Thanks for the advice! The first zstd patches went through Herbert’s tree, which is
why I’ve sent them this way.

I’d be happy to maintain zstd (and lz4, and xxhash), though I don’t know exactly what
that entails.

Best,
Nick

> - Eric



Re: [PATCH v7 0/3] Update to zstd-1.4.6

2020-12-15 Thread Nick Terrell



> On Dec 3, 2020, at 12:51 PM, Nick Terrell  wrote:
> 
> From: Nick Terrell 
> 
> Please pull from
> 
>  g...@github.com:terrelln/linux.git tags/v7-zstd-1.4.6
> 
> to get these changes. Alternatively the patchset is included.

Is it possible to get this patchset merged in the 5.11 merge window? It applies
cleanly to the latest master. Please let me know if there is anything that I can do
to drive this patchset towards merge.

Thanks,
Nick

> This patchset upgrades the zstd library to the latest upstream release. The
> current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
> At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
> But, it is now possible to use upstream zstd directly in the kernel.
> 
> I have not yet released zstd-1.4.6 upstream. I want the zstd version in the
> kernel to match up with a known upstream release, so we know exactly what code
> is running. Whenever this patchset is ready for merge, I will cut a release at
> the upstream commit that gets merged. This should not be necessary for future
> releases.
> 
> The kernel zstd library is automatically generated from upstream zstd. A script
> makes the necessary changes and imports it into the kernel. The changes are:
> 
> 1. Replace all libc dependencies with kernel replacements and rewrite includes.
> 2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
> 3. Use the kernel xxhash instead of bundling it.
> 
> This automation gets tested every commit by upstream's continuous integration.
> When we cut a new zstd release, we will submit a patch to the kernel to update
> the zstd version in the kernel.
> 
> I've updated zstd to upstream with one big patch because every commit must build,
> so that precludes partial updates. Since the commit is 100% generated, I hope the
> review burden is lightened. I considered replaying upstream commits, but that is
> not possible because there have been ~3500 upstream commits since the last zstd
> import, and the commits don't all build individually. The bulk update preserves
> bisectability because bugs can be bisected to the zstd version update. At that
> point the update can be reverted, and we can work with upstream to find and fix
> the bug. After this big switch in how the kernel consumes zstd, future patches
> will be smaller, because they will only have one upstream release worth of
> changes each.
> 
> This patchset adds a new kernel-style wrapper around zstd. This wrapper API is
> functionally equivalent to the subset of the current zstd API that is currently
> used. The wrapper API changes to be kernel style so that the symbols don't
> collide with zstd's symbols. The update to zstd-1.4.6 maintains the same API
> and preserves the semantics, so that none of the callers need to be updated.
> 
> This patchset comes in 2 parts:
> 1. The first 2 patches prepare for the zstd upgrade. The first patch adds the
>   new kernel style API so zstd can be upgraded without modifying any callers.
>   The second patch adds an indirection for the lib/decompress_unzstd.c
>   including of all decompression source files.
> 2. Import zstd-1.4.6. This patch is completely generated from upstream using
>   automated tooling.
> 
> I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade
> using the compatibility wrapper, and after the final patch in this series. 
> 
> I tested kernel and initramfs decompression in i386 and arm.
> 
> I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
> I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
> I found:
> * BtrFS zstd compression at levels 1 and 3 is 5% faster
> * BtrFS zstd decompression+read is 15% faster
> * SquashFS zstd decompression+read is 15% faster
> * F2FS zstd compression+write at level 3 is 8% faster
> * F2FS zstd decompression+read is 20% faster
> * ZRAM decompression+read is 30% faster
> * Kernel zstd decompression is 35% faster
> * Initramfs zstd decompression+build is 5% faster
> 
> The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during
> compression. For example, the recent problem with large kernel decompression has
> been fixed upstream for over 2 years: https://lkml.org/lkml/2020/9/29/27.
> 
> Please let me know if there is anything that I can do to ease the way for these
> patches. I think it is important because it gets large performance improvements,
> contains bug fixes, and is switching to a more maintainable model of consuming
> upstream zstd directly, making it easy to keep up to date.
> 
> Best,
> Nick Terrell
> 
> v1 -> v2:
> * S

Re: [PATCH v6 1/3] lib: zstd: Add kernel-specific API

2020-12-03 Thread Nick Terrell


> On Dec 2, 2020, at 9:03 PM, Michał Mirosław  wrote:
> 
> On Thu, Dec 03, 2020 at 03:59:21AM +0000, Nick Terrell wrote:
>> On Dec 2, 2020, at 7:14 PM, Michał Mirosław  wrote:
>>> On Thu, Dec 03, 2020 at 01:42:03AM +0000, Nick Terrell wrote:
>>>> On Dec 2, 2020, at 5:16 PM, Michał Mirosław  wrote:
>>>>> On Wed, Dec 02, 2020 at 12:32:40PM -0800, Nick Terrell wrote:
>>>>>> From: Nick Terrell 
>>>>>> 
>>>>>> This patch:
>>>>>> - Moves `include/linux/zstd.h` -> `lib/zstd/zstd.h`
>>>>>> - Adds a new API in `include/linux/zstd.h` that is functionally
>>>>>> equivalent to the in-use subset of the current API. Functions are
>>>>>> renamed to avoid symbol collisions with zstd, to make it clear it is
>>>>>> not the upstream zstd API, and to follow the kernel style guide.
>>>>>> - Updates all callers to use the new API.
>>>>>> 
>>>>>> There are no functional changes in this patch. Since there are no
>>>>>> functional changes, I felt it was okay to update all the callers in a
>>>>>> single patch, since once the API is approved, the callers are
>>>>>> mechanically changed.
>>>>> [...]
>>>>>> --- a/lib/decompress_unzstd.c
>>>>>> +++ b/lib/decompress_unzstd.c
>>>>> [...]
>>>>>> static int INIT handle_zstd_error(size_t ret, void (*error)(char *x))
>>>>>> {
>>>>>> -const int err = ZSTD_getErrorCode(ret);
>>>>>> -
>>>>>> -if (!ZSTD_isError(ret))
>>>>>> +if (!zstd_is_error(ret))
>>>>>>  return 0;
>>>>>> 
>>>>>> -switch (err) {
>>>>>> -case ZSTD_error_memory_allocation:
>>>>>> -error("ZSTD decompressor ran out of memory");
>>>>>> -break;
>>>>>> -case ZSTD_error_prefix_unknown:
>>>>>> -error("Input is not in the ZSTD format (wrong magic bytes)");
>>>>>> -break;
>>>>>> -case ZSTD_error_dstSize_tooSmall:
>>>>>> -case ZSTD_error_corruption_detected:
>>>>>> -case ZSTD_error_checksum_wrong:
>>>>>> -error("ZSTD-compressed data is corrupt");
>>>>>> -break;
>>>>>> -default:
>>>>>> -error("ZSTD-compressed data is probably corrupt");
>>>>>> -break;
>>>>>> -}
>>>>>> +error("ZSTD decompression failed");
>>>>>>  return -1;
>>>>>> }
>>>>> 
>>>>> This loses diagnostic specificity - is this intended? At least the
>>>>> out-of-memory condition seems useful to distinguish.
>>>> 
>>>> Good point. The zstd API no longer exposes the error code enum,
>>>> but it does expose zstd_get_error_name() which can be used here.
>>>> I was thinking that the string needed to be static for some reason, but
>>>> that is not the case. I will make that change.
>>>> 
>>>>>> +size_t zstd_compress_stream(zstd_cstream *cstream,
>>>>>> +struct zstd_out_buffer *output, struct zstd_in_buffer *input)
>>>>>> +{
>>>>>> +ZSTD_outBuffer o;
>>>>>> +ZSTD_inBuffer i;
>>>>>> +size_t ret;
>>>>>> +
>>>>>> +memcpy(&o, output, sizeof(o));
>>>>>> +memcpy(&i, input, sizeof(i));
>>>>>> +ret = ZSTD_compressStream(cstream, &o, &i);
>>>>>> +memcpy(output, &o, sizeof(o));
>>>>>> +memcpy(input, &i, sizeof(i));
>>>>>> +return ret;
>>>>>> +}
>>>>> 
>>>>> Is all this copying necessary? How is it different from type-punning by
>>>>> direct pointer cast?
>>>> 
>>>> If breaking strict aliasing and type-punning by pointer casting is okay, then
>>>> we can do that here. These memcpys will be negligible for performance, but
>>>> type-punning would be more succinct if allowed.
>>> 
>>> Ah, this might break LTO builds due to strict aliasing violation.
>>> So I would suggest to just #define the ZSTD names to kernel ones
>>> for the library code.  Unless there is a cleaner solution...
>> 
>> I don’t want to do that because in the 3rd patch of the series I update
>> to zstd-1.4.6, and I’m using zstd-1.4.6 as-is from upstream. This allows us to keep
>> the kernel version up to date, since the patch to update to a new version can be
>> generated automatically (and manually tested), so it doesn’t fall years behind
>> upstream again.
>> 
>> The alternative would be to make upstream zstd’s header public and
>> #define zstd_in_buffer ZSTD_inBuffer. But that would make zstd’s header
>> public, which would somewhat defeat the purpose of having a kernel wrapper.
> 
> I thought the problem was API style spill-over from the library to other parts
> of the kernel.  A header-only wrapper can stop this.  I'm not sure symbol
> visibility (namespace pollution) was a concern.

Thanks for the review Michał! I have just submitted a new version of the patches
with the suggested changes!

Best,
Nick Terrell

> Best Regards
> Michał Mirosław



[PATCH v7 2/3] lib: zstd: Add decompress_sources.h for decompress_unzstd

2020-12-03 Thread Nick Terrell
From: Nick Terrell 

Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.

This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.

Signed-off-by: Nick Terrell 
---
 lib/decompress_unzstd.c   |  6 +-
 lib/zstd/decompress_sources.h | 14 ++
 2 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 lib/zstd/decompress_sources.h

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index dab2d55cf08d..e6897a5063a7 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
 #ifdef STATIC
 # define UNZSTD_PREBOOT
 # include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
 #endif
 
 #include 
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index 000000000000..d2fe10af0043
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
-- 
2.29.2



[PATCH v7 1/3] lib: zstd: Add kernel-specific API

2020-12-03 Thread Nick Terrell
From: Nick Terrell 

This patch:
- Moves `include/linux/zstd.h` -> `include/linux/zstd_lib.h`
- Adds a new API in `include/linux/zstd.h` that is functionally
  equivalent to the in-use subset of the current API. Functions are
  renamed to avoid symbol collisions with zstd, to make it clear it is
  not the upstream zstd API, and to follow the kernel style guide.
- Updates all callers to use the new API.

There are no functional changes in this patch. Since there are no
functional changes, I felt it was okay to update all the callers in a
single patch, since once the API is approved, the callers are
mechanically changed.

This patch is preparing the next patch in the series, which updates
zstd to version 1.4.6. Since the upstream zstd API is no longer exposed
to callers, the update can happen transparently.

Signed-off-by: Nick Terrell 
---
 crypto/zstd.c  |   28 +-
 fs/btrfs/zstd.c|   68 +-
 fs/f2fs/compress.c |   56 +-
 fs/pstore/platform.c   |2 +-
 fs/squashfs/zstd_wrapper.c |   16 +-
 include/linux/zstd.h   | 1218 
 include/linux/zstd_lib.h   | 1157 ++
 lib/decompress_unzstd.c|   42 +-
 lib/zstd/compress.c|  107 ++--
 lib/zstd/decompress.c  |  112 ++--
 10 files changed, 1657 insertions(+), 1149 deletions(-)
 create mode 100644 include/linux/zstd_lib.h

diff --git a/crypto/zstd.c b/crypto/zstd.c
index 1a3309f066f7..154a969c83a8 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -18,22 +18,22 @@
 #define ZSTD_DEF_LEVEL 3
 
 struct zstd_ctx {
-   ZSTD_CCtx *cctx;
-   ZSTD_DCtx *dctx;
+   zstd_cctx *cctx;
+   zstd_dctx *dctx;
void *cwksp;
void *dwksp;
 };
 
-static ZSTD_parameters zstd_params(void)
+static zstd_parameters zstd_params(void)
 {
-   return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
+   return zstd_get_params(ZSTD_DEF_LEVEL, 0);
 }
 
 static int zstd_comp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const ZSTD_parameters params = zstd_params();
-   const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+   const zstd_parameters params = zstd_params();
+   const size_t wksp_size = zstd_cctx_workspace_bound(&params.cParams);
 
ctx->cwksp = vzalloc(wksp_size);
if (!ctx->cwksp) {
@@ -41,7 +41,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+   ctx->cctx = zstd_init_cctx(ctx->cwksp, wksp_size);
if (!ctx->cctx) {
ret = -EINVAL;
goto out_free;
@@ -56,7 +56,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 static int zstd_decomp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+   const size_t wksp_size = zstd_dctx_workspace_bound();
 
ctx->dwksp = vzalloc(wksp_size);
if (!ctx->dwksp) {
@@ -64,7 +64,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+   ctx->dctx = zstd_init_dctx(ctx->dwksp, wksp_size);
if (!ctx->dctx) {
ret = -EINVAL;
goto out_free;
@@ -152,10 +152,10 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
 {
size_t out_len;
struct zstd_ctx *zctx = ctx;
-   const ZSTD_parameters params = zstd_params();
+   const zstd_parameters params = zstd_params();
 
-   out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
-   if (ZSTD_isError(out_len))
+   out_len = zstd_compress_cctx(zctx->cctx, dst, *dlen, src, slen, &params);
+   if (zstd_is_error(out_len))
return -EINVAL;
*dlen = out_len;
return 0;
@@ -182,8 +182,8 @@ static int __zstd_decompress(const u8 *src, unsigned int slen,
size_t out_len;
struct zstd_ctx *zctx = ctx;
 
-   out_len = ZSTD_decompressDCtx(zctx->dctx, dst, *dlen, src, slen);
-   if (ZSTD_isError(out_len))
+   out_len = zstd_decompress_dctx(zctx->dctx, dst, *dlen, src, slen);
+   if (zstd_is_error(out_len))
return -EINVAL;
*dlen = out_len;
return 0;
diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index 9a4871636c6c..c8cf690013f3 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -28,10 +28,10 @@
 /* 307s to avoid pathologically clashing with transaction commit */
 #define ZSTD_BTRFS_RECLAIM_JIFFIES (307 * HZ)
 
-static ZSTD_parameters zstd_get_btrfs_parameters(unsigned int level,
+static zstd_parameters zstd_get_btrfs_parameters(unsigned int level,
 size_t src_len)
 {
-   ZSTD_parameters params = ZSTD_getParams(level, src_len, 0);
+   zstd_parameters params = zstd_get_params(level, src_len);
 
if (params.cParams.windowLog > ZSTD_B

[PATCH v7 0/3] Update to zstd-1.4.6

2020-12-03 Thread Nick Terrell
From: Nick Terrell 

Please pull from

  g...@github.com:terrelln/linux.git tags/v7-zstd-1.4.6

to get these changes. Alternatively the patchset is included.

This patchset upgrades the zstd library to the latest upstream release. The
current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.6 upstream. I want the zstd version in the
kernel to match up with a known upstream release, so we know exactly what code
is running. Whenever this patchset is ready for merge, I will cut a release at
the upstream commit that gets merged. This should not be necessary for future
releases.

The kernel zstd library is automatically generated from upstream zstd. A script
makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested every commit by upstream's continuous integration.
When we cut a new zstd release, we will submit a patch to the kernel to update
the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must build,
so that precludes partial updates. Since the commit is 100% generated, I hope the
review burden is lightened. I considered replaying upstream commits, but that is
not possible because there have been ~3500 upstream commits since the last zstd
import, and the commits don't all build individually. The bulk update preserves
bisectability because bugs can be bisected to the zstd version update. At that
point the update can be reverted, and we can work with upstream to find and fix
the bug. After this big switch in how the kernel consumes zstd, future patches
will be smaller, because they will only have one upstream release worth of
changes each.

This patchset adds a new kernel-style wrapper around zstd. This wrapper API is
functionally equivalent to the subset of the current zstd API that is currently
used. The wrapper API changes to be kernel style so that the symbols don't
collide with zstd's symbols. The update to zstd-1.4.6 maintains the same API
and preserves the semantics, so that none of the callers need to be updated.

This patchset comes in 2 parts:
1. The first 2 patches prepare for the zstd upgrade. The first patch adds the
   new kernel style API so zstd can be upgraded without modifying any callers.
   The second patch adds an indirection for the lib/decompress_unzstd.c
   including of all decompression source files.
2. Import zstd-1.4.6. This patch is completely generated from upstream using
   automated tooling.

I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade
using the compatibility wrapper, and after the final patch in this series. 

I tested kernel and initramfs decompression in i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
I found:
* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during
compression. For example, the recent problem with large kernel decompression has
been fixed upstream for over 2 years: https://lkml.org/lkml/2020/9/29/27.

Please let me know if there is anything that I can do to ease the way for these
patches. I think it is important because it gets large performance improvements,
contains bug fixes, and is switching to a more maintainable model of consuming
upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 means unknown.
  This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.

v2 -> v3:
* (3/9) Silence warnings by Kernel Test Robot:
  https://github.com/facebook/zstd/pull/2324
  Stack size warnings remain, but these aren't new, and the functions it warns on
  are either unused or not in the maximum stack path. This patchset reduces zstd
  compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and
  more stack reduction would require significant changes that have the potential
  to introduce new bugs. However, I do hope to continue to reduce zstd stack
  usage in future versions.

v3 -> v4:
* (3/9) F

Re: [PATCH v6 1/3] lib: zstd: Add kernel-specific API

2020-12-02 Thread Nick Terrell


> On Dec 2, 2020, at 9:03 PM, Michał Mirosław  wrote:
> 
> On Thu, Dec 03, 2020 at 03:59:21AM +0000, Nick Terrell wrote:
>> On Dec 2, 2020, at 7:14 PM, Michał Mirosław  wrote:
>>> On Thu, Dec 03, 2020 at 01:42:03AM +0000, Nick Terrell wrote:
>>>> On Dec 2, 2020, at 5:16 PM, Michał Mirosław  wrote:
>>>>> On Wed, Dec 02, 2020 at 12:32:40PM -0800, Nick Terrell wrote:
>>>>>> From: Nick Terrell 
>>>>>> 
>>>>>> This patch:
>>>>>> - Moves `include/linux/zstd.h` -> `lib/zstd/zstd.h`
>>>>>> - Adds a new API in `include/linux/zstd.h` that is functionally
>>>>>> equivalent to the in-use subset of the current API. Functions are
>>>>>> renamed to avoid symbol collisions with zstd, to make it clear it is
>>>>>> not the upstream zstd API, and to follow the kernel style guide.
>>>>>> - Updates all callers to use the new API.
>>>>>> 
>>>>>> There are no functional changes in this patch. Since there are no
>>>>>> functional changes, I felt it was okay to update all the callers in a
>>>>>> single patch, since once the API is approved, the callers are
>>>>>> mechanically changed.
>>>>> [...]
>>>>>> --- a/lib/decompress_unzstd.c
>>>>>> +++ b/lib/decompress_unzstd.c
>>>>> [...]
>>>>>> static int INIT handle_zstd_error(size_t ret, void (*error)(char *x))
>>>>>> {
>>>>>> -const int err = ZSTD_getErrorCode(ret);
>>>>>> -
>>>>>> -if (!ZSTD_isError(ret))
>>>>>> +if (!zstd_is_error(ret))
>>>>>>  return 0;
>>>>>> 
>>>>>> -switch (err) {
>>>>>> -case ZSTD_error_memory_allocation:
>>>>>> -error("ZSTD decompressor ran out of memory");
>>>>>> -break;
>>>>>> -case ZSTD_error_prefix_unknown:
>>>>>> -error("Input is not in the ZSTD format (wrong magic bytes)");
>>>>>> -break;
>>>>>> -case ZSTD_error_dstSize_tooSmall:
>>>>>> -case ZSTD_error_corruption_detected:
>>>>>> -case ZSTD_error_checksum_wrong:
>>>>>> -error("ZSTD-compressed data is corrupt");
>>>>>> -break;
>>>>>> -default:
>>>>>> -error("ZSTD-compressed data is probably corrupt");
>>>>>> -break;
>>>>>> -}
>>>>>> +error("ZSTD decompression failed");
>>>>>>  return -1;
>>>>>> }
>>>>> 
>>>>> This loses diagnostic specificity - is this intended? At least the
>>>>> out-of-memory condition seems useful to distinguish.
>>>> 
>>>> Good point. The zstd API no longer exposes the error code enum,
>>>> but it does expose zstd_get_error_name() which can be used here.
>>>> I was thinking that the string needed to be static for some reason, but
>>>> that is not the case. I will make that change.
>>>> 
>>>>>> +size_t zstd_compress_stream(zstd_cstream *cstream,
>>>>>> +struct zstd_out_buffer *output, struct zstd_in_buffer *input)
>>>>>> +{
>>>>>> +ZSTD_outBuffer o;
>>>>>> +ZSTD_inBuffer i;
>>>>>> +size_t ret;
>>>>>> +
>>>>>> +memcpy(&o, output, sizeof(o));
>>>>>> +memcpy(&i, input, sizeof(i));
>>>>>> +ret = ZSTD_compressStream(cstream, &o, &i);
>>>>>> +memcpy(output, &o, sizeof(o));
>>>>>> +memcpy(input, &i, sizeof(i));
>>>>>> +return ret;
>>>>>> +}
>>>>> 
>>>>> Is all this copying necessary? How is it different from type-punning by
>>>>> direct pointer cast?
>>>> 
>>>> If breaking strict aliasing and type-punning by pointer casting is okay, then
>>>> we can do that here. These memcpys will be negligible for performance, but
>>>> type-punning would be more succinct if allowed.
>>> 
>>> Ah, this might break LTO builds due to strict aliasing violation.
>>> So I would suggest to just #define the ZSTD names to kernel ones
>>> for the library code.  Unless there is a cleaner solution...
>> 
>> I don’t want to do that because in the 3rd patch of the series I update
>> to zstd-1.4.6, and I’m using zstd-1.4.6 as-is from upstream. This allows us to keep
>> the kernel version up to date, since the patch to update to a new version can be
>> generated automatically (and manually tested), so it doesn’t fall years behind
>> upstream again.
>> 
>> The alternative would be to make upstream zstd’s header public and
>> #define zstd_in_buffer ZSTD_inBuffer. But that would make zstd’s header
>> public, which would somewhat defeat the purpose of having a kernel wrapper.
> 
> I thought the problem was API style spill-over from the library to other parts
> of the kernel.  A header-only wrapper can stop this.  I'm not sure symbol
> visibility (namespace pollution) was a concern.

That's true. It seems slightly unclean, but so is duplicating these structs and memcpying
them. So I’ll go ahead and expose the upstream zstd’s header (“lib/zstd/zstd.h” here).
I’ll just need to pick a name for the upstream “zstd.h” header.
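
As a rough illustration of that direction (a sketch only, not the code from the
patch): once the wrapper header can see the library's types, the kernel-style
name can simply alias the library type, so the wrapper needs neither a copy nor
a pointer cast. LIB_outBuffer, LIB_fill, wrap_out_buffer and wrap_fill are
hypothetical stand-ins for the zstd names, chosen so the snippet builds on its
own in user space.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a type declared in the upstream header. */
typedef struct {
	void *dst;
	size_t size;
	size_t pos;
} LIB_outBuffer;

/* Hypothetical stand-in for a library call: fills the free space with one byte value. */
static size_t LIB_fill(LIB_outBuffer *out, unsigned char byte)
{
	unsigned char *p = out->dst;
	size_t written = 0;

	while (out->pos < out->size) {
		p[out->pos++] = byte;
		written++;
	}
	return written;
}

/* Wrapper header: the kernel-style name aliases the library type. */
#define wrap_out_buffer LIB_outBuffer

/* Both names denote one type, so no copy and no cast is needed. */
size_t wrap_fill(wrap_out_buffer *output, unsigned char byte)
{
	assert(output != NULL && output->pos <= output->size);
	return LIB_fill(output, byte);
}
```

Because the two names refer to the same type, there is no aliasing question for
the compiler to get wrong. The cost is the one noted above: the library header
becomes visible to callers.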

> Best Regards
> Michał Mirosław



Re: [PATCH v6 1/3] lib: zstd: Add kernel-specific API

2020-12-02 Thread Nick Terrell


> On Dec 2, 2020, at 7:14 PM, Michał Mirosław  wrote:
> 
> On Thu, Dec 03, 2020 at 01:42:03AM +0000, Nick Terrell wrote:
>> 
>> 
>>> On Dec 2, 2020, at 5:16 PM, Michał Mirosław  wrote:
>>> 
>>> On Wed, Dec 02, 2020 at 12:32:40PM -0800, Nick Terrell wrote:
>>>> From: Nick Terrell 
>>>> 
>>>> This patch:
>>>> - Moves `include/linux/zstd.h` -> `lib/zstd/zstd.h`
>>>> - Adds a new API in `include/linux/zstd.h` that is functionally
>>>> equivalent to the in-use subset of the current API. Functions are
>>>> renamed to avoid symbol collisions with zstd, to make it clear it is
>>>> not the upstream zstd API, and to follow the kernel style guide.
>>>> - Updates all callers to use the new API.
>>>> 
>>>> There are no functional changes in this patch. Since there are no
>>>> functional changes, I felt it was okay to update all the callers in a
>>>> single patch, since once the API is approved, the callers are
>>>> mechanically changed.
>>> [...]
>>>> --- a/lib/decompress_unzstd.c
>>>> +++ b/lib/decompress_unzstd.c
>>> [...]
>>>> static int INIT handle_zstd_error(size_t ret, void (*error)(char *x))
>>>> {
>>>> -  const int err = ZSTD_getErrorCode(ret);
>>>> -
>>>> -  if (!ZSTD_isError(ret))
>>>> +  if (!zstd_is_error(ret))
>>>>return 0;
>>>> 
>>>> -  switch (err) {
>>>> -  case ZSTD_error_memory_allocation:
>>>> -  error("ZSTD decompressor ran out of memory");
>>>> -  break;
>>>> -  case ZSTD_error_prefix_unknown:
>>>> -  error("Input is not in the ZSTD format (wrong magic bytes)");
>>>> -  break;
>>>> -  case ZSTD_error_dstSize_tooSmall:
>>>> -  case ZSTD_error_corruption_detected:
>>>> -  case ZSTD_error_checksum_wrong:
>>>> -  error("ZSTD-compressed data is corrupt");
>>>> -  break;
>>>> -  default:
>>>> -  error("ZSTD-compressed data is probably corrupt");
>>>> -  break;
>>>> -  }
>>>> +  error("ZSTD decompression failed");
>>>>return -1;
>>>> }
>>> 
>>> This loses diagnostic specificity - is this intended? At least the
>>> out-of-memory condition seems useful to distinguish.
>> 
>> Good point. The zstd API no longer exposes the error code enum,
>> but it does expose zstd_get_error_name() which can be used here.
>> I was thinking that the string needed to be static for some reason, but
>> that is not the case. I will make that change.
>> 
>>>> +size_t zstd_compress_stream(zstd_cstream *cstream,
>>>> +  struct zstd_out_buffer *output, struct zstd_in_buffer *input)
>>>> +{
>>>> +  ZSTD_outBuffer o;
>>>> +  ZSTD_inBuffer i;
>>>> +  size_t ret;
>>>> +
>>>> +  memcpy(&o, output, sizeof(o));
>>>> +  memcpy(&i, input, sizeof(i));
>>>> +  ret = ZSTD_compressStream(cstream, &o, &i);
>>>> +  memcpy(output, &o, sizeof(o));
>>>> +  memcpy(input, &i, sizeof(i));
>>>> +  return ret;
>>>> +}
>>> 
>>> Is all this copying necessary? How is it different from type-punning by
>>> direct pointer cast?
>> 
>> If breaking strict aliasing and type-punning by pointer casting is okay, then
>> we can do that here. These memcpys will be negligible for performance, but
>> type-punning would be more succinct if allowed.
> 
> Ah, this might break LTO builds due to strict aliasing violation.
> So I would suggest to just #define the ZSTD names to kernel ones
> for the library code.  Unless there is a cleaner solution...

I don’t want to do that because in the 3rd patch of the series I update
to zstd-1.4.6, and I’m using zstd-1.4.6 as-is from upstream. This allows us to keep
the kernel version up to date, since the patch to update to a new version can be
generated automatically (and manually tested), so it doesn’t fall years behind
upstream again.

The alternative would be to make upstream zstd’s header public and
#define zstd_in_buffer ZSTD_inBuffer. But that would make zstd’s header
public, which would somewhat defeat the purpose of having a kernel wrapper.

These memcpy’s won’t hurt performance, since this function is called at most
every 4KB of input data in all the callers, though they are clunky.
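
To make the trade-off concrete, here is a minimal user-space sketch of the
copy-in/copy-out pattern from the quoted wrapper. lib_in_buffer, lib_consume,
struct wrap_in_buffer and wrap_consume are hypothetical stand-ins for the zstd
names, and the "library call" is a toy that just consumes all remaining input.
The sketch assumes the two structs have identical layout, which the
static_assert only partly checks (size, not field order).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical library-side type, standing in for ZSTD_inBuffer. */
typedef struct {
	const void *src;
	size_t size;
	size_t pos;
} lib_in_buffer;

/* Hypothetical kernel-style type declared with the same fields. */
struct wrap_in_buffer {
	const void *src;
	size_t size;
	size_t pos;
};

static_assert(sizeof(lib_in_buffer) == sizeof(struct wrap_in_buffer),
	      "wrapper and library buffers must have identical size");

/* Toy stand-in for the library call: consumes all remaining input. */
static size_t lib_consume(lib_in_buffer *in)
{
	size_t consumed = in->size - in->pos;

	in->pos = in->size;
	return consumed;
}

/* Copy in, call the library, then copy the updated state back out. */
size_t wrap_consume(struct wrap_in_buffer *input)
{
	lib_in_buffer i;
	size_t ret;

	memcpy(&i, input, sizeof(i));
	ret = lib_consume(&i);
	memcpy(input, &i, sizeof(i));
	return ret;
}
```

The copies keep the two types fully separate, so no object is ever accessed
through a pointer to the other type, at the price of two small memcpy calls per
invocation.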



Re: [PATCH v6 1/3] lib: zstd: Add kernel-specific API

2020-12-02 Thread Nick Terrell


> On Dec 2, 2020, at 5:16 PM, Michał Mirosław  wrote:
> 
> On Wed, Dec 02, 2020 at 12:32:40PM -0800, Nick Terrell wrote:
>> From: Nick Terrell 
>> 
>> This patch:
>> - Moves `include/linux/zstd.h` -> `lib/zstd/zstd.h`
>> - Adds a new API in `include/linux/zstd.h` that is functionally
>>  equivalent to the in-use subset of the current API. Functions are
>>  renamed to avoid symbol collisions with zstd, to make it clear it is
>>  not the upstream zstd API, and to follow the kernel style guide.
>> - Updates all callers to use the new API.
>> 
>> There are no functional changes in this patch. Since there are no
>> functional changes, I felt it was okay to update all the callers in a
>> single patch, since once the API is approved, the callers are
>> mechanically changed.
> [...]
>> --- a/lib/decompress_unzstd.c
>> +++ b/lib/decompress_unzstd.c
> [...]
>> static int INIT handle_zstd_error(size_t ret, void (*error)(char *x))
>> {
>> -const int err = ZSTD_getErrorCode(ret);
>> -
>> -if (!ZSTD_isError(ret))
>> +if (!zstd_is_error(ret))
>>  return 0;
>> 
>> -switch (err) {
>> -case ZSTD_error_memory_allocation:
>> -error("ZSTD decompressor ran out of memory");
>> -break;
>> -case ZSTD_error_prefix_unknown:
>> -error("Input is not in the ZSTD format (wrong magic bytes)");
>> -break;
>> -case ZSTD_error_dstSize_tooSmall:
>> -case ZSTD_error_corruption_detected:
>> -case ZSTD_error_checksum_wrong:
>> -error("ZSTD-compressed data is corrupt");
>> -break;
>> -default:
>> -error("ZSTD-compressed data is probably corrupt");
>> -break;
>> -}
>> +error("ZSTD decompression failed");
>>  return -1;
>> }
> 
> This loses diagnostic specificity - is this intended? At least the
> out-of-memory condition seems useful to distinguish.

Good point. The zstd API no longer exposes the error code enum,
but it does expose zstd_get_error_name() which can be used here.
I was thinking that the string needed to be static for some reason, but
that is not the case. I will make that change.
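
A minimal sketch of the shape this could take, written so it builds in user
space. demo_is_error() and demo_get_error_name() are hypothetical stand-ins for
zstd_is_error() and zstd_get_error_name(); the error encoding and the message
strings are assumptions made for the example, not what the library does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical error values; the real encoding is internal to zstd. */
#define DEMO_ERR_MEMORY  ((size_t)-1)
#define DEMO_ERR_CORRUPT ((size_t)-2)

static_assert(DEMO_ERR_MEMORY > DEMO_ERR_CORRUPT,
	      "demo error codes occupy the top of the size_t range");

/* Stand-in for zstd_is_error(): nonzero if ret encodes an error. */
static unsigned int demo_is_error(size_t ret)
{
	return ret >= DEMO_ERR_CORRUPT;
}

/* Stand-in for zstd_get_error_name(): maps a result to a readable string. */
static const char *demo_get_error_name(size_t ret)
{
	if (ret == DEMO_ERR_MEMORY)
		return "Allocation error : not enough memory";
	if (ret == DEMO_ERR_CORRUPT)
		return "Corrupted block detected";
	return "No error detected";
}

/* Remembers the last message handed to the error callback. */
const char *last_error;

void record_error(char *msg)
{
	last_error = msg;
}

/* Same shape as handle_zstd_error(), but reports the specific error name. */
int handle_error(size_t ret, void (*error)(char *x))
{
	if (!demo_is_error(ret))
		return 0;

	/* The callback takes a non-const char *, hence the cast. */
	error((char *)demo_get_error_name(ret));
	return -1;
}
```

With this shape, an out-of-memory result passed through handle_error() reaches
the callback with its own message instead of a single generic string.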

>> +size_t zstd_compress_stream(zstd_cstream *cstream,
>> +struct zstd_out_buffer *output, struct zstd_in_buffer *input)
>> +{
>> +ZSTD_outBuffer o;
>> +ZSTD_inBuffer i;
>> +size_t ret;
>> +
>> +memcpy(&o, output, sizeof(o));
>> +memcpy(&i, input, sizeof(i));
>> +ret = ZSTD_compressStream(cstream, &o, &i);
>> +memcpy(output, &o, sizeof(o));
>> +memcpy(input, &i, sizeof(i));
>> +return ret;
>> +}
> 
> Is all this copying necessary? How is it different from type-punning by
> direct pointer cast?

If breaking strict aliasing and type-punning by pointer casting is okay, then
we can do that here. These memcpys will be negligible for performance, but
type-punning would be more succinct if allowed.

Best,
Nick



[PATCH v6 2/3] lib: zstd: Add decompress_sources.h for decompress_unzstd

2020-12-02 Thread Nick Terrell
From: Nick Terrell 

Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.

This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.

Signed-off-by: Nick Terrell 
---
 lib/decompress_unzstd.c   |  6 +-
 lib/zstd/decompress_sources.h | 14 ++
 2 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 lib/zstd/decompress_sources.h

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index 87ff567fd76d..d42281d7d416 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
 #ifdef STATIC
 # define UNZSTD_PREBOOT
 # include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
 #endif
 
 #include 
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index 000000000000..d2fe10af0043
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
-- 
2.29.2



[PATCH v6 0/3] Update to zstd-1.4.6

2020-12-02 Thread Nick Terrell
From: Nick Terrell 

Please pull from

  g...@github.com:terrelln/linux.git tags/zstd-1.4.6-v6

to get these changes. Alternatively the patchset is included.

This patchset upgrades the zstd library to the latest upstream release. The
current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.6 upstream. I want the zstd version in the
kernel to match up with a known upstream release, so we know exactly what code
is running. Whenever this patchset is ready for merge, I will cut a release at
the upstream commit that gets merged. This should not be necessary for future
releases.

The kernel zstd library is automatically generated from upstream zstd. A script
makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested every commit by upstream's continuous integration.
When we cut a new zstd release, we will submit a patch to the kernel to update
the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must build,
so that precludes partial updates. Since the commit is 100% generated, I hope the
review burden is lightened. I considered replaying upstream commits, but that is
not possible because there have been ~3500 upstream commits since the last zstd
import, and the commits don't all build individually. The bulk update preserves
bisectability because bugs can be bisected to the zstd version update. At that
point the update can be reverted, and we can work with upstream to find and fix
the bug. After this big switch in how the kernel consumes zstd, future patches
will be smaller, because they will only have one upstream release worth of
changes each.

This patchset adds a new kernel-style wrapper around zstd. This wrapper API is
functionally equivalent to the subset of the current zstd API that is currently
used. The wrapper API changes to be kernel style so that the symbols don't
collide with zstd's symbols. The update to zstd-1.4.6 maintains the same API
and preserves the semantics, so that none of the callers need to be updated.

This patchset comes in 2 parts:
1. The first 2 patches prepare for the zstd upgrade. The first patch adds the
   new kernel style API so zstd can be upgraded without modifying any callers.
   The second patch adds an indirection for the lib/decompress_unzstd.c
   including of all decompression source files.
2. Import zstd-1.4.6. This patch is completely generated from upstream using
   automated tooling.

I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade
using the compatibility wrapper, and after the final patch in this series. 

I tested kernel and initramfs decompression in i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
I found:
* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during
compression. For example the recent problem with large kernel decompression has
been fixed upstream for over 2 years https://lkml.org/lkml/2020/9/29/27.

Please let me know if there is anything that I can do to ease the way for these
patches. I think it is important because it gets large performance improvements,
contains bug fixes, and is switching to a more maintainable model of consuming
upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 means unknown.
  This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.

v2 -> v3:
* (3/9) Silence warnings by Kernel Test Robot:
  https://github.com/facebook/zstd/pull/2324
  Stack size warnings remain, but these aren't new, and the functions it warns on
  are either unused or not in the maximum stack path. This patchset reduces zstd
  compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and
  more stack reduction would require significant changes that have the potential
  to introduce new bugs. However, I do hope to continue to reduce zstd stack
  usage in future versions.

v3 -> v4:
* (3/9) F

Re: [PATCH v2] lib/lz4: explicitly support in-place decompression

2020-11-30 Thread Nick Terrell



> On Nov 21, 2020, at 7:07 PM, Gao Xiang  wrote:
> 
> LZ4 final literal copy could be overlapped when doing
> in-place decompression, so it's unsafe to just use memcpy()
> on an optimized memcpy approach but memmove() instead.
> 
> Upstream LZ4 has updated this years ago [1] (and the impact
> is non-sensible [2] plus only a few bytes remain), this commit
> just synchronizes LZ4 upstream code to the kernel side as well.
> 
> It can be observed as EROFS in-place decompression failure
> on specific files when X86_FEATURE_ERMS is unsupported,
> memcpy() optimization of commit 59daa706fbec ("x86, mem:
> Optimize memcpy by avoiding memory false dependece") will
> be enabled then.
> 
> Currently most modern x86-CPUs support ERMS, these CPUs just
> use "rep movsb" approach so no problem at all. However, it can
> still be verified with forcely disabling ERMS feature...
> 
> arch/x86/lib/memcpy_64.S:
>ALTERNATIVE_2 "jmp memcpy_orig", "", X86_FEATURE_REP_GOOD, \
> - "jmp memcpy_erms", X86_FEATURE_ERMS
> + "jmp memcpy_orig", X86_FEATURE_ERMS
> 
> We didn't observe any strange on arm64/arm/x86 platform before
> since most memcpy() would behave in an increasing address order
> ("copy upwards" [3]) and it's the correct order of in-place
> decompression but it really needs an update to memmove() for sure
> considering it's an undefined behavior according to the standard
> and some unique optimization already exists in the kernel.
> 
> [1] https://github.com/lz4/lz4/commit/33cb8518ac385835cc17be9a770b27b40cd0e15b
> [2] https://github.com/lz4/lz4/pull/717#issuecomment-497818921
> [3] https://sourceware.org/bugzilla/show_bug.cgi?id=12518 
> Cc: Yann Collet 
> Cc: Nick Terrell 
> Cc: Miao Xie 
> Cc: Chao Yu 
> Cc: Li Guifu 
> Cc: Guo Xuenan 
> Signed-off-by: Gao Xiang 
> ---
> changes since v1:
> - refine commit message;
> - Cc more people.
> 
> Hi Andrew,
> 
> Could you kindly consider picking this patch up, although
> the impact is EROFS but it touchs in-kernel lz4 library
> anyway...
> 
> Thanks,
> Gao Xiang
> 
> lib/lz4/lz4_decompress.c | 6 +-
> lib/lz4/lz4defs.h| 1 +
> 2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
> index 00cb0d0b73e1..8a7724a6ce2f 100644
> --- a/lib/lz4/lz4_decompress.c
> +++ b/lib/lz4/lz4_decompress.c
> @@ -263,7 +263,11 @@ static FORCE_INLINE int LZ4_decompress_generic(
>   }
>   }
> 
> - LZ4_memcpy(op, ip, length);
> + /*
> +  * supports overlapping memory regions; only matters
> +  * for in-place decompression scenarios
> +  */
> + LZ4_memmove(op, ip, length);
>   ip += length;
>   op += length;
> 
> diff --git a/lib/lz4/lz4defs.h b/lib/lz4/lz4defs.h
> index c91dd96ef629..673bd206aa98 100644
> --- a/lib/lz4/lz4defs.h
> +++ b/lib/lz4/lz4defs.h
> @@ -146,6 +146,7 @@ static FORCE_INLINE void LZ4_writeLE16(void *memPtr, U16 value)
>  * environments. This is needed when decompressing the Linux Kernel, for example.
>  */
> #define LZ4_memcpy(dst, src, size) __builtin_memcpy(dst, src, size)
> +#define LZ4_memmove(dst, src, size) __builtin_memmove(dst, src, size)
> 
> static FORCE_INLINE void LZ4_copy8(void *dst, const void *src)
> {
> -- 
> 2.18.4
> 

Looks good to me! You can add:

Reviewed-by: Nick Terrell 
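
For readers following along, here is a minimal sketch of why the final literal
copy needs memmove() when decompressing in place. It is not the LZ4 code.
copy_backward() is a hypothetical stand-in for an optimized memcpy() that does
not copy in increasing address order, and the buffer layout is made up purely
for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for an optimized memcpy() that copies from the
 * highest address down. A real memcpy() may behave like this, because
 * passing overlapping regions to it is undefined behavior.
 */
static void copy_backward(unsigned char *dst, const unsigned char *src,
			  size_t n)
{
	assert(dst != NULL && src != NULL);
	while (n--)
		dst[n] = src[n];
}

/*
 * In-place layout: the literals still to be copied sit just ahead of
 * the output position in the same buffer, so dst < src and the two
 * ranges overlap. Returns 1 if the output is intact, 0 otherwise.
 */
static int inplace_copy_ok(int use_memmove)
{
	unsigned char buf[16] = "..ABCDEFGH";
	unsigned char *op = buf;	/* output position */
	unsigned char *ip = buf + 2;	/* remaining literals */
	size_t length = 8;

	if (use_memmove)
		memmove(op, ip, length);	/* defined for overlap */
	else
		copy_backward(op, ip, length);	/* clobbers unread input */

	return memcmp(buf, "ABCDEFGH", length) == 0;
}
```

With these assumptions inplace_copy_ok(1) succeeds, while inplace_copy_ok(0)
fails because the backward copy overwrites source bytes before it has read
them. A copy in increasing address order happens to work for this layout,
which matches the observation in the commit message that the bug stays hidden
on most platforms.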



Re: [PATCH v5 1/9] lib: zstd: Add zstd compatibility wrapper

2020-11-10 Thread Nick Terrell


> On Nov 10, 2020, at 10:39 AM, Christoph Hellwig  wrote:
> 
> On Mon, Nov 09, 2020 at 02:01:41PM -0500, Chris Mason wrote:
>> You do consistently ask for a shim layer, but you haven't explained what
>> we gain by diverging from the documented and tested API of the upstream zstd
>> project.  It's an important discussion given that we hope to regularly
>> update the kernel side as they make improvements in zstd.
> 
> An API that looks like every other kernel API, and doesn't cause endless
> amount of churn because someone decided they need a new API flavor of
> the day.  Btw, I'm not asking for a shim layer - that was the compromise
> we ended up with.

I will put up a version of the patch set with the shim layer. I will follow the
kernel style guide for the shim, which will involve function renaming. I will
prefix the functions with “zstd_” instead of “ZSTD_” to make it clear that
this is not the upstream zstd API, but rather a kernel wrapper (and be closer
to the style guide).

Other than renaming to follow the kernel style guide, I will keep the API as
similar as possible to the existing API, to minimize churn.

Please let me know if you have any particular requests for the shim that I
haven't mentioned, or if you would prefer something else. Alternatively, 
comment on the patches once I put them up. Expect them later this week
or weekend.

Best,
Nick

> If zstd folks can't maintain a sane code base maybe we should just drop
> this childish churning code base from the tree.
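
To make the renaming idea concrete, here is a rough sketch of what one shim
function could look like. Both names are assumptions for illustration:
UPSTREAM_estimateDStreamSize() is a stand-in with a made-up body, not the
real zstd function, and zstd_dstream_workspace_bound() is a hypothetical
kernel-style name rather than the one the final patches use.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Stand-in for an upstream camelCase estimator. The body is invented
 * so that the sketch compiles on its own.
 */
static size_t UPSTREAM_estimateDStreamSize(size_t window_size)
{
	return window_size + 4096;
}

/*
 * Hypothetical kernel-style wrapper: lower-case "zstd_" prefix and
 * snake_case, same arguments and semantics as the function it wraps.
 */
static inline size_t zstd_dstream_workspace_bound(size_t window_size)
{
	assert(window_size > 0);
	return UPSTREAM_estimateDStreamSize(window_size);
}
```

The wrapper adds no behavior of its own, so callers see only a naming change
and the upstream code can be imported unmodified underneath it.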



Re: [PATCH v5 1/9] lib: zstd: Add zstd compatibility wrapper

2020-11-10 Thread Nick Terrell
> On Nov 10, 2020, at 7:25 AM, David Sterba  wrote:
> 
> On Mon, Nov 09, 2020 at 02:01:41PM -0500, Chris Mason wrote:
>> On 6 Nov 2020, at 13:38, Christoph Hellwig wrote:
>>> You just keep resedning this crap, don't you?  Haven't you been told
>>> multiple times to provide a proper kernel API by now?
>> 
>> You do consistently ask for a shim layer, but you haven’t explained 
>> what we gain by diverging from the documented and tested API of the 
>> upstream zstd project.  It’s an important discussion given that we 
>> hope to regularly update the kernel side as they make improvements in 
>> zstd.
>> 
>> The only benefit described so far seems to be camelcase related, but if 
>> there are problems in the API beyond that, I haven’t seen you describe 
>> them.  I don’t think the camelcase alone justifies the added costs of 
>> the shim.
> 
> The API change in this patchset is adding churn that wouldn't be
> necessary if there were an upstream<->kernel API from the beginning.
> 
> The patch 5/9 is almost entirely renaming just some internal identifiers
> 
> -   ZSTD_CStreamWorkspaceBound(params.cParams),
> -   ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT));
> +   
> ZSTD_estimateCStreamSize_usingCParams(params.cParams),
> +   ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT));
> 
> plus updating the names in the error strings. The compression API that
> filesystems need is simple:
> 
> - set up workspace and parameters
> - compress buffer
> - decompress buffer
> 
> We really should not care if upstream has 3 functions for initializing
> stream (ZSTD_initCStream/ZSTD_initStaticCStream/ZSTD_initCStream_advanced),
> or if the name changes again in the future.

Upstream will not change these function names. We guarantee the stable
portion of our API has a fixed ABI. The unstable portion doesn’t make this
guarantee, but we guarantee never to change function semantics in an
incompatible way, and to provide long deprecation periods (years) when we
delete functions.

For reference, the only functions we’ve deleted/modified since v1.0.0 when we
stabilized the zstd format 4 years ago are an old streaming API that was
deprecated before v1.0.0. We’ve added new functions, and provided new
recommended ways to use our API that we think are better. But, we highly
value not breaking our users code, so all the older APIs are still supported.

This churn is caused because the current version of zstd inside the kernel is
not upstream zstd. At the time of integration upstream zstd wasn’t ready to
be used as-is in the kernel. When I integrated zstd into the kernel, I should’ve
done more work to use upstream as-is. It was a mistake that I would like to
correct, so the kernel can benefit from the significant performance
improvements that upstream has made in the last few years.

> This should not require explicit explanation, this should be a natural
> requirement especially for separate projects that don't share the same
> coding style but have to be integrated in some way.

I’m not completely against providing a kernel shim. Personally, I don’t believe
it provides much benefit. But if the consensus of kernel developers is that a
shim provides a better API, then I’m happy to provide it. So far, I haven’t seen
a clear consensus either way. That leaves me kind of stuck.

Best,
Nick



Re: [GIT PULL][PATCH v5 0/9] Update to zstd-1.4.6

2020-11-06 Thread Nick Terrell
> On Nov 6, 2020, at 9:15 AM, Josef Bacik  wrote:
> 
> On 11/3/20 1:05 AM, Nick Terrell wrote:
>> From: Nick Terrell 
>> Please pull from
>>   g...@github.com:terrelln/linux.git tags/v5-zstd-1.4.6
>> to get these changes. Alternatively the patchset is included.
> 
> Where did we come down on the code formatting question?  Personally I'm of 
> the mind that as long as the consumers themselves adhere to the proper coding 
> style I'm fine not maintaining the code style as long as we get the benefit 
> of easily syncing in code from the upstream project.  Thanks,

The general consensus of everyone who has been involved in the discussion so 
far, seems to be that the benefits of keeping zstd in-sync with upstream 
outweigh the cost of accepting upstream’s API, though not everyone agrees. The 
alternative is to provide a wrapper around upstream’s API, but this makes it 
slightly harder to debug, since you have to go through the wrapper whose only 
purpose is to adapt to the coding style, and allows bugs to sneak into the 
kernel implementation, which aren’t present upstream.

Additionally, in 2017 LZ4 switched to using upstream LZ4’s API in order to stay 
up to date with upstream, which sets precedent. I also help maintain LZ4, and 
once the zstd update is merged, I plan to work on making it easier to update 
LZ4 in the kernel when upstream updates. That will be a much smaller change, 
since LZ4 is already nearly using upstream’s code directly.

Best,
Nick

Re: [PATCH v5 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API

2020-11-06 Thread Nick Terrell


> On Nov 6, 2020, at 9:10 AM, Josef Bacik  wrote:
> 
> On 11/3/20 1:05 AM, Nick Terrell wrote:
>> From: Nick Terrell 
>> Move away from the compatibility wrapper to the zstd-1.4.6 API. This
>> code is functionally equivalent.
>> Signed-off-by: Nick Terrell 
>> ---
>>  fs/btrfs/zstd.c | 48 
>>  1 file changed, 28 insertions(+), 20 deletions(-)
>> diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
>> index a7367ff573d4..6b466e090cd7 100644
>> --- a/fs/btrfs/zstd.c
>> +++ b/fs/btrfs/zstd.c
>> @@ -16,7 +16,7 @@
>>  #include 
>>  #include 
>>  #include 
>> -#include 
>> +#include 
>>  #include "misc.h"
>>  #include "compression.h"
>>  #include "ctree.h"
>> @@ -159,8 +159,8 @@ static void zstd_calc_ws_mem_sizes(void)
>>  zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT);
>>  size_t level_size =
>>  max_t(size_t,
>> -  ZSTD_CStreamWorkspaceBound(params.cParams),
>> -  ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT));
>> +  ZSTD_estimateCStreamSize_usingCParams(params.cParams),
>> +  ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT));
>>  max_size = max_t(size_t, max_size, level_size);
>>  zstd_ws_mem_sizes[level - 1] = max_size;
>> @@ -389,13 +389,23 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
>>  *total_in = 0;
>>  /* Initialize the stream */
>> -stream = ZSTD_initCStream(params, len, workspace->mem,
>> -workspace->size);
>> +stream = ZSTD_initStaticCStream(workspace->mem, workspace->size);
>>  if (!stream) {
>> -pr_warn("BTRFS: ZSTD_initCStream failed\n");
>> +pr_warn("BTRFS: ZSTD_initStaticCStream failed\n");
>>  ret = -EIO;
>>  goto out;
>>  }
>> +{
>> +size_t ret2;
>> +
>> +ret2 = ZSTD_initCStream_advanced(stream, NULL, 0, params, len);
>> +if (ZSTD_isError(ret2)) {
>> +pr_warn("BTRFS: ZSTD_initCStream_advanced returned %s\n",
>> +ZSTD_getErrorName(ret2));
>> +ret = -EIO;
>> +goto out;
>> +}
>> +}
> 
> Please don't do this, you can just add size_t ret2 at the top and not put 
> this in a block.  Other than that the code looks fine, once you fix that you 
> can add

Thanks for the review, I’ll make that change!

> Reviewed-by: Josef Bacik 
> 
> Thanks,
> 
> Josef
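
As a sketch of the requested change: the second status variable is declared
with the other locals at the top of the function instead of inside a bare
nested block. The fake_*() helpers are hypothetical stand-ins for the two
zstd initialization calls, so this shows only the shape of the code, not the
btrfs implementation.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two-step stream setup. */
static int fake_alloc_stream(size_t workspace_size)
{
	return workspace_size >= 64;		/* nonzero on success */
}

static size_t fake_init_stream(size_t len)
{
	return len == 0 ? (size_t)-1 : 0;	/* (size_t)-1 signals an error */
}

static int fake_is_error(size_t code)
{
	return code == (size_t)-1;
}

static int init_stream(size_t workspace_size, size_t len)
{
	size_t ret2;	/* declared up front, not in a nested block */
	int ret = 0;

	if (!fake_alloc_stream(workspace_size)) {
		ret = -EIO;
		goto out;
	}

	ret2 = fake_init_stream(len);
	if (fake_is_error(ret2)) {
		ret = -EIO;
		goto out;
	}
out:
	return ret;
}
```

Both failure paths share the same exit label, so no extra scope is needed
just to hold ret2.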



[PATCH v5 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API

2020-11-02 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 fs/btrfs/zstd.c | 48 
 1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index a7367ff573d4..6b466e090cd7 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include "misc.h"
 #include "compression.h"
 #include "ctree.h"
@@ -159,8 +159,8 @@ static void zstd_calc_ws_mem_sizes(void)
zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT);
size_t level_size =
max_t(size_t,
- ZSTD_CStreamWorkspaceBound(params.cParams),
- ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT));
+ ZSTD_estimateCStreamSize_usingCParams(params.cParams),
+ ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT));
 
max_size = max_t(size_t, max_size, level_size);
zstd_ws_mem_sizes[level - 1] = max_size;
@@ -389,13 +389,23 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
*total_in = 0;
 
/* Initialize the stream */
-   stream = ZSTD_initCStream(params, len, workspace->mem,
-   workspace->size);
+   stream = ZSTD_initStaticCStream(workspace->mem, workspace->size);
if (!stream) {
-   pr_warn("BTRFS: ZSTD_initCStream failed\n");
+   pr_warn("BTRFS: ZSTD_initStaticCStream failed\n");
ret = -EIO;
goto out;
}
+   {
+   size_t ret2;
+
+   ret2 = ZSTD_initCStream_advanced(stream, NULL, 0, params, len);
+   if (ZSTD_isError(ret2)) {
+   pr_warn("BTRFS: ZSTD_initCStream_advanced returned %s\n",
+   ZSTD_getErrorName(ret2));
+   ret = -EIO;
+   goto out;
+   }
+   }
 
/* map in the first page of input data */
in_page = find_get_page(mapping, start >> PAGE_SHIFT);
@@ -421,8 +431,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
ret2 = ZSTD_compressStream(stream, &workspace->out_buf,
&workspace->in_buf);
if (ZSTD_isError(ret2)) {
-   pr_debug("BTRFS: ZSTD_compressStream returned %d\n",
-   ZSTD_getErrorCode(ret2));
+   pr_debug("BTRFS: ZSTD_compressStream returned %s\n",
+   ZSTD_getErrorName(ret2));
ret = -EIO;
goto out;
}
@@ -489,8 +499,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
 
ret2 = ZSTD_endStream(stream, &workspace->out_buf);
if (ZSTD_isError(ret2)) {
-   pr_debug("BTRFS: ZSTD_endStream returned %d\n",
-   ZSTD_getErrorCode(ret2));
+   pr_debug("BTRFS: ZSTD_endStream returned %s\n",
+   ZSTD_getErrorName(ret2));
ret = -EIO;
goto out;
}
@@ -557,10 +567,9 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
unsigned long buf_start;
unsigned long total_out = 0;
 
-   stream = ZSTD_initDStream(
-   ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
+   stream = ZSTD_initStaticDStream(workspace->mem, workspace->size);
if (!stream) {
-   pr_debug("BTRFS: ZSTD_initDStream failed\n");
+   pr_debug("BTRFS: ZSTD_initStaticDStream failed\n");
ret = -EIO;
goto done;
}
@@ -579,8 +588,8 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
ret2 = ZSTD_decompressStream(stream, &workspace->out_buf,
&workspace->in_buf);
if (ZSTD_isError(ret2)) {
-   pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
-   ZSTD_getErrorCode(ret2));
+   pr_debug("BTRFS: ZSTD_decompressStream returned %s\n",
+   ZSTD_getErrorName(ret2));
ret = -EIO;
goto done;
}
@@ -633,10 +642,9 @@ int zstd_decompress(struct list_head *ws, unsigned char *data_in,
unsigned long pg_offset = 0;
char *kaddr;
 
-

[PATCH v5 8/9] lib: unzstd: Switch to the zstd-1.4.6 API

2020-11-02 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 lib/decompress_unzstd.c | 40 ++--
 1 file changed, 14 insertions(+), 26 deletions(-)

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index 3c6ad01ffcd5..efbe66501b34 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -73,7 +73,8 @@
 
 #include 
 #include 
-#include 
+#include 
+#include 
 
 /* 128MB is the maximum window size supported by zstd. */
 #define ZSTD_WINDOWSIZE_MAX (1 << ZSTD_WINDOWLOG_MAX)
@@ -120,9 +121,9 @@ static int INIT decompress_single(const u8 *in_buf, long in_len, u8 *out_buf,
  long out_len, long *in_pos,
  void (*error)(char *x))
 {
-   const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+   const size_t wksp_size = ZSTD_estimateDCtxSize();
void *wksp = large_malloc(wksp_size);
-   ZSTD_DCtx *dctx = ZSTD_initDCtx(wksp, wksp_size);
+   ZSTD_DCtx *dctx = ZSTD_initStaticDCtx(wksp, wksp_size);
int err;
size_t ret;
 
@@ -165,7 +166,6 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
 {
ZSTD_inBuffer in;
ZSTD_outBuffer out;
-   ZSTD_frameParams params;
void *in_allocated = NULL;
void *out_allocated = NULL;
void *wksp = NULL;
@@ -234,36 +234,24 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
out.size = out_len;
 
/*
-* We need to know the window size to allocate the ZSTD_DStream.
-* Since we are streaming, we need to allocate a buffer for the sliding
-* window. The window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
-* (8 MB), so it is important to use the actual value so as not to
-* waste memory when it is smaller.
+* Zstd determines the workspace size from the window size written
+* into the frame header. This ensures that we use the minimum value
+* possible, since the window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
+* (1 GB), so it is very important to use the actual value.
 */
-   ret = ZSTD_getFrameParams(&params, in.src, in.size);
+   wksp_size = ZSTD_estimateDStreamSize_fromFrame(in.src, in.size);
err = handle_zstd_error(ret, error);
if (err)
goto out;
-   if (ret != 0) {
-   error("ZSTD-compressed data has an incomplete frame header");
-   err = -1;
-   goto out;
-   }
-   if (params.windowSize > ZSTD_WINDOWSIZE_MAX) {
-   error("ZSTD-compressed data has too large a window size");
+   wksp = large_malloc(wksp_size);
+   if (wksp == NULL) {
+   error("Out of memory while allocating ZSTD_DStream");
err = -1;
goto out;
}
-
-   /*
-* Allocate the ZSTD_DStream now that we know how much memory is
-* required.
-*/
-   wksp_size = ZSTD_DStreamWorkspaceBound(params.windowSize);
-   wksp = large_malloc(wksp_size);
-   dstream = ZSTD_initDStream(params.windowSize, wksp, wksp_size);
+   dstream = ZSTD_initStaticDStream(wksp, wksp_size);
if (dstream == NULL) {
-   error("Out of memory while allocating ZSTD_DStream");
+   error("ZSTD_initStaticDStream failed");
err = -1;
goto out;
}
-- 
2.28.0
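
For context on the "window size written into the frame header" mentioned in
the comment above, here is a small sketch of how a Window_Descriptor byte maps
to a window size, following the formula in the Zstandard format specification
(RFC 8878). The function name is made up for this example, and it handles only
the descriptor byte; single-segment frames, which carry no descriptor, are
left out.

```c
#include <stdint.h>

/*
 * Decode a Window_Descriptor byte into a window size in bytes.
 * The top 5 bits are the exponent and the low 3 bits are the mantissa.
 */
static uint64_t window_size_from_descriptor(uint8_t descriptor)
{
	unsigned int exponent = descriptor >> 3;
	unsigned int mantissa = descriptor & 7;
	uint64_t window_base = (uint64_t)1 << (10 + exponent);
	uint64_t window_add = (window_base / 8) * mantissa;

	return window_base + window_add;
}
```

A descriptor of 0 gives the 1 KB minimum, and each exponent step doubles the
base, with the mantissa adding eighths of the base on top.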



[PATCH v5 9/9] lib: zstd: Remove zstd compatibility wrapper

2020-11-02 Thread Nick Terrell
From: Nick Terrell 

All callers have been transitioned to the new zstd-1.4.6 API. There are
no more callers of the zstd compatibility wrapper, so delete it.

Signed-off-by: Nick Terrell 
---
 include/linux/zstd_compat.h | 116 
 1 file changed, 116 deletions(-)
 delete mode 100644 include/linux/zstd_compat.h

diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h
deleted file mode 100644
index cda9208bf04a..
--- a/include/linux/zstd_compat.h
+++ /dev/null
@@ -1,116 +0,0 @@
-/*
- * Copyright (c) 2016-present, Facebook, Inc.
- * All rights reserved.
- *
- * This source code is licensed under the BSD-style license found in the
- * LICENSE file in the root directory of https://github.com/facebook/zstd.
- * An additional grant of patent rights can be found in the PATENTS file in the
- * same directory.
- *
- * This program is free software; you can redistribute it and/or modify it under
- * the terms of the GNU General Public License version 2 as published by the
- * Free Software Foundation. This program is dual-licensed; you may select
- * either version 2 of the GNU General Public License ("GPL") or BSD license
- * ("BSD").
- */
-
-#ifndef ZSTD_COMPAT_H
-#define ZSTD_COMPAT_H
-
-#include 
-
-#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406)
-/*
- * This header provides backwards compatibility for the zstd-1.4.6 library
- * upgrade. This header allows us to upgrade the zstd library version without
- * modifying any callers. Then we will migrate callers from the compatibility
- * wrapper one at a time until none remain. At which point we will delete this
- * header.
- *
- * It is temporary and will be deleted once the upgrade is complete.
- */
-
-#include 
-
-static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params)
-{
-return ZSTD_estimateCCtxSize_usingCParams(compression_params);
-}
-
-static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params)
-{
-return ZSTD_estimateCStreamSize_usingCParams(compression_params);
-}
-
-static inline size_t ZSTD_DCtxWorkspaceBound(void)
-{
-return ZSTD_estimateDCtxSize();
-}
-
-static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size)
-{
-return ZSTD_estimateDStreamSize(window_size);
-}
-
-static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size)
-{
-if (wksp == NULL)
-return NULL;
-return ZSTD_initStaticCCtx(wksp, wksp_size);
-}
-
-static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, uint64_t pledged_src_size, void* wksp, size_t wksp_size)
-{
-ZSTD_CStream* cstream;
-size_t ret;
-
-if (wksp == NULL)
-return NULL;
-
-cstream = ZSTD_initStaticCStream(wksp, wksp_size);
-if (cstream == NULL)
-return NULL;
-
-/* 0 means unknown in old API but means 0 in new API */
-if (pledged_src_size == 0)
-pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN;
-
-ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size);
-if (ZSTD_isError(ret))
-return NULL;
-
-return cstream;
-}
-#define ZSTD_initCStream ZSTD_initCStream_compat
-
-static inline ZSTD_DCtx* ZSTD_initDCtx(void* wksp, size_t wksp_size)
-{
-if (wksp == NULL)
-return NULL;
-return ZSTD_initStaticDCtx(wksp, wksp_size);
-}
-
-static inline ZSTD_DStream* ZSTD_initDStream_compat(unsigned long long window_size, void* wksp, size_t wksp_size)
-{
-if (wksp == NULL)
-return NULL;
-(void)window_size;
-return ZSTD_initStaticDStream(wksp, wksp_size);
-}
-#define ZSTD_initDStream ZSTD_initDStream_compat
-
-typedef ZSTD_frameHeader ZSTD_frameParams;
-
-static inline size_t ZSTD_getFrameParams(ZSTD_frameParams* frame_params, const void* src, size_t src_size)
-{
-return ZSTD_getFrameHeader(frame_params, src, src_size);
-}
-
-static inline size_t ZSTD_compressCCtx_compat(ZSTD_CCtx* cctx, void* dst, size_t dst_capacity, const void* src, size_t src_size, ZSTD_parameters params)
-{
-return ZSTD_compress_advanced(cctx, dst, dst_capacity, src, src_size, NULL, 0, params);
-}
-#define ZSTD_compressCCtx ZSTD_compressCCtx_compat
-
-#endif /* ZSTD_VERSION_NUMBER >= 10406 */
-#endif /* ZSTD_COMPAT_H */
-- 
2.28.0
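
The one non-trivial translation in the wrapper above is the pledged source
size: 0 meant "unknown" in the old kernel API but means an empty input in the
newer one. A standalone sketch of just that mapping follows.
CONTENT_SIZE_UNKNOWN and translate_pledged_src_size() are local names chosen
for this example. The all-ones sentinel value is an assumption that mirrors
how upstream defines ZSTD_CONTENTSIZE_UNKNOWN.

```c
#include <stdint.h>

/* Local stand-in for the "size not known" sentinel (all bits set). */
#define CONTENT_SIZE_UNKNOWN UINT64_MAX

/*
 * Translate an old-API pledged size into the new-API convention:
 * 0 becomes "unknown", every other value passes through unchanged.
 */
static uint64_t translate_pledged_src_size(uint64_t old_api_size)
{
	if (old_api_size == 0)
		return CONTENT_SIZE_UNKNOWN;
	return old_api_size;
}
```

One consequence of the old convention is that a caller could not pledge a
truly empty input. The mapping preserves that behavior rather than changing
it.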



[GIT PULL][PATCH v5 0/9] Update to zstd-1.4.6

2020-11-02 Thread Nick Terrell
From: Nick Terrell 

Please pull from

  g...@github.com:terrelln/linux.git tags/v5-zstd-1.4.6

to get these changes. Alternatively the patchset is included.

This patchset upgrades the zstd library to the latest upstream release. The
current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.6 upstream. I want the zstd version in the kernel
to match up with a known upstream release, so we know exactly what code is
running. Whenever this patchset is ready for merge, I will cut a release at the
upstream commit that gets merged. This should not be necessary for future
releases.

The kernel zstd library is automatically generated from upstream zstd. A script
makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested every commit by upstream's continuous integration.
When we cut a new zstd release, we will submit a patch to the kernel to update
the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must build,
so that precludes partial updates. Since the commit is 100% generated, I hope the
review burden is lightened. I considered replaying upstream commits, but that is
not possible because there have been ~3500 upstream commits since the last zstd
import, and the commits don't all build individually. The bulk update preserves
bisectability because bugs can be bisected to the zstd version update. At that
point the update can be reverted, and we can work with upstream to find and fix
the bug. After this big switch in how the kernel consumes zstd, future patches
will be smaller, because they will only have one upstream release worth of
changes each.

This patchset changes the zstd API from a custom kernel API to the upstream API.
I considered wrapping the upstream API with a wrapper that is closer to the
kernel style guide. Following advice from https://lkml.org/lkml/2020/9/17/814
I've chosen to use the upstream API directly, to minimize opportunities to
introduce bugs, and because using the upstream API directly makes debugging and
communication with upstream easier.

This patchset comes in 3 parts:
1. The first 2 patches prepare for the zstd upgrade. The first patch adds a
   compatibility wrapper so zstd can be upgraded without modifying any callers.
   The second patch adds an indirection for the lib/decompress_unzstd.c including
   of all decompression source files.
2. Import zstd-1.4.6. This patch is completely generated from upstream using
   automated tooling.
3. Update all callers to the zstd-1.4.6 API then delete the compatibility
   wrapper.

I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade
using the compatibility wrapper, and after the final patch in this series. 

I tested kernel and initramfs decompression in i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
I found:
* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during
compression. For example the recent problem with large kernel decompression has
been fixed upstream for over 2 years https://lkml.org/lkml/2020/9/29/27.

Please let me know if there is anything that I can do to ease the way for these
patches. I think it is important because it gets large performance improvements,
contains bug fixes, and is switching to a more maintainable model of consuming
upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 means unknown.
  This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.

v2 -> v3:
* (3/9) Silence warnings by Kernel Test Robot:
  https://github.com/facebook/zstd/pull/2324
  Stack size warnings remain, but these aren't new, and the functions it warns on
  are either unused or not in the maximum stack path. This patchset reduces zstd
  compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and
  more stack reduction would require significant changes that have the potential
  to int

[PATCH v5 7/9] squashfs: zstd: Switch to the zstd-1.4.6 API

2020-11-02 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 fs/squashfs/zstd_wrapper.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
index f8c512a6204e..add582409866 100644
--- a/fs/squashfs/zstd_wrapper.c
+++ b/fs/squashfs/zstd_wrapper.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include "squashfs_fs.h"
@@ -34,7 +34,7 @@ static void *zstd_init(struct squashfs_sb_info *msblk, void *buff)
goto failed;
wksp->window_size = max_t(size_t,
msblk->block_size, SQUASHFS_METADATA_SIZE);
-   wksp->mem_size = ZSTD_DStreamWorkspaceBound(wksp->window_size);
+   wksp->mem_size = ZSTD_estimateDStreamSize(wksp->window_size);
wksp->mem = vmalloc(wksp->mem_size);
if (wksp->mem == NULL)
goto failed;
@@ -71,7 +71,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
struct bvec_iter_all iter_all = {};
struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
 
-   stream = ZSTD_initDStream(wksp->window_size, wksp->mem, wksp->mem_size);
+   stream = ZSTD_initStaticDStream(wksp->mem, wksp->mem_size);
 
if (!stream) {
ERROR("Failed to initialize zstd decompressor\n");
@@ -122,8 +122,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
break;
 
if (ZSTD_isError(zstd_err)) {
-   ERROR("zstd decompression error: %d\n",
-   (int)ZSTD_getErrorCode(zstd_err));
+   ERROR("zstd decompression error: %s\n", ZSTD_getErrorName(zstd_err));
error = -EIO;
break;
}
-- 
2.28.0



[PATCH v5 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API

2020-11-02 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is more efficient because it uses the single-pass API instead of
the streaming API. The streaming API is not necessary because the whole
input and output buffers are available. This saves memory because we
don't need to allocate a buffer for the window. It is also more
efficient because it saves unnecessary memcpy calls.

Compression memory increases from 168 KB to 204 KB because upstream
uses slightly more memory. Decompression memory decreases from 1.4 MB
to 158 KB.

Signed-off-by: Nick Terrell 
---
 fs/f2fs/compress.c | 101 +
 1 file changed, 37 insertions(+), 64 deletions(-)

diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 57a6360b9827..8f8234877666 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -11,7 +11,8 @@
 #include 
 #include 
 #include 
-#include 
+#include 
+#include 
 
 #include "f2fs.h"
 #include "node.h"
@@ -322,21 +323,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = {
 static int zstd_init_compress_ctx(struct compress_ctx *cc)
 {
ZSTD_parameters params;
-   ZSTD_CStream *stream;
+   ZSTD_CCtx *ctx;
void *workspace;
unsigned int workspace_size;
 
params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0);
-   workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams);
+   workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams);
 
workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode),
workspace_size, GFP_NOFS);
if (!workspace)
return -ENOMEM;
 
-   stream = ZSTD_initCStream(params, 0, workspace, workspace_size);
-   if (!stream) {
-   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initCStream failed\n",
+   ctx = ZSTD_initStaticCCtx(workspace, workspace_size);
+   if (!ctx) {
+   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initStaticCCtx failed\n",
KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
__func__);
kvfree(workspace);
@@ -344,7 +345,7 @@ static int zstd_init_compress_ctx(struct compress_ctx *cc)
}
 
cc->private = workspace;
-   cc->private2 = stream;
+   cc->private2 = ctx;
 
cc->clen = cc->rlen - PAGE_SIZE - COMPRESS_HEADER_SIZE;
return 0;
@@ -359,66 +360,48 @@ static void zstd_destroy_compress_ctx(struct compress_ctx *cc)
 
 static int zstd_compress_pages(struct compress_ctx *cc)
 {
-   ZSTD_CStream *stream = cc->private2;
-   ZSTD_inBuffer inbuf;
-   ZSTD_outBuffer outbuf;
-   int src_size = cc->rlen;
-   int dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
-   int ret;
-
-   inbuf.pos = 0;
-   inbuf.src = cc->rbuf;
-   inbuf.size = src_size;
-
-   outbuf.pos = 0;
-   outbuf.dst = cc->cbuf->cdata;
-   outbuf.size = dst_size;
+   ZSTD_CCtx *ctx = cc->private2;
+   const size_t src_size = cc->rlen;
+   const size_t dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
+   ZSTD_parameters params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, src_size, 0);
+   size_t ret;
 
-   ret = ZSTD_compressStream(stream, &outbuf, &inbuf);
+   ret = ZSTD_compress_advanced(
+   ctx, cc->cbuf->cdata, dst_size, cc->rbuf, src_size, NULL, 0, params);
if (ZSTD_isError(ret)) {
-   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compressStream failed, ret: %d\n",
-   KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
-   __func__, ZSTD_getErrorCode(ret));
-   return -EIO;
-   }
-
-   ret = ZSTD_endStream(stream, &outbuf);
-   if (ZSTD_isError(ret)) {
-   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_endStream returned %d\n",
+   /*
+* there is compressed data remained in intermediate buffer due to
+* no more space in cbuf.cdata
+*/
+   if (ZSTD_getErrorCode(ret) == ZSTD_error_dstSize_tooSmall)
+   return -EAGAIN;
+   /* other compression errors return -EIO */
+   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compress_advanced failed, err: %s\n",
KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
-   __func__, ZSTD_getErrorCode(ret));
+   __func__, ZSTD_getErrorName(ret));
return -EIO;
}
 
-   /*
-* there is compressed data remained in intermediate buffer due to
-* no more space in cbuf.cdata
-*/
-   if (ret)
-   return -EAGAIN;
-
-   cc->clen = outbuf.
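
A minimal, self-contained sketch of the error handling in
zstd_compress_pages() above: a size_t return value doubles as an error
carrier, and only the "destination too small" case maps to -EAGAIN. The
sketch_* names and the numeric enum values are assumptions made for
illustration and are not the zstd API; the patch itself uses ZSTD_isError(),
ZSTD_getErrorCode() and ZSTD_error_dstSize_tooSmall.

```c
#include <errno.h>
#include <stddef.h>

/* Illustrative stand-ins for the zstd error enum; the values are assumptions. */
enum sketch_error {
	SKETCH_ERROR_NONE = 0,
	SKETCH_ERROR_GENERIC = 1,
	SKETCH_ERROR_DST_TOO_SMALL = 70,
	SKETCH_ERROR_MAX = 120,
};

/* Encode an error in a size_t: error values occupy the very top of the range. */
static inline size_t sketch_make_error(enum sketch_error e)
{
	return (size_t)0 - (size_t)e;
}

/* A result is an error if it lies above the encoded maximum error code. */
static inline int sketch_is_error(size_t ret)
{
	return ret > sketch_make_error(SKETCH_ERROR_MAX);
}

/* Recover the error code, or SKETCH_ERROR_NONE for an ordinary byte count. */
static inline enum sketch_error sketch_error_code(size_t ret)
{
	if (!sketch_is_error(ret))
		return SKETCH_ERROR_NONE;
	return (enum sketch_error)((size_t)0 - ret);
}

/* Map a single-pass compression result to 0, -EAGAIN or -EIO. */
static inline int sketch_compress_status(size_t ret)
{
	if (!sketch_is_error(ret))
		return 0;
	/* The output did not fit into the destination buffer. */
	if (sketch_error_code(ret) == SKETCH_ERROR_DST_TOO_SMALL)
		return -EAGAIN;
	/* Every other compression error is reported as an I/O error. */
	return -EIO;
}
```

Under these assumptions an ordinary byte count yields 0, the
destination-too-small error yields -EAGAIN, and any other error yields -EIO.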

[PATCH v5 4/9] crypto: zstd: Switch to zstd-1.4.6 API

2020-11-02 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 crypto/zstd.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/crypto/zstd.c b/crypto/zstd.c
index dcda3cad3b5c..767fe2fbe009 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 
@@ -24,16 +24,15 @@ struct zstd_ctx {
void *dwksp;
 };
 
-static ZSTD_parameters zstd_params(void)
-{
-   return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
-}
-
 static int zstd_comp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const ZSTD_parameters params = zstd_params();
-   const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+   const size_t wksp_size = ZSTD_estimateCCtxSize(ZSTD_DEF_LEVEL);
+
+   if (ZSTD_isError(wksp_size)) {
+   ret = -EINVAL;
+   goto out_free;
+   }
 
ctx->cwksp = vzalloc(wksp_size);
if (!ctx->cwksp) {
@@ -41,7 +40,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+   ctx->cctx = ZSTD_initStaticCCtx(ctx->cwksp, wksp_size);
if (!ctx->cctx) {
ret = -EINVAL;
goto out_free;
@@ -56,7 +55,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 static int zstd_decomp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+   const size_t wksp_size = ZSTD_estimateDCtxSize();
 
ctx->dwksp = vzalloc(wksp_size);
if (!ctx->dwksp) {
@@ -64,7 +63,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+   ctx->dctx = ZSTD_initStaticDCtx(ctx->dwksp, wksp_size);
if (!ctx->dctx) {
ret = -EINVAL;
goto out_free;
@@ -152,9 +151,8 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
 {
size_t out_len;
struct zstd_ctx *zctx = ctx;
-   const ZSTD_parameters params = zstd_params();
 
-   out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
+   out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, ZSTD_DEF_LEVEL);
if (ZSTD_isError(out_len))
return -EINVAL;
*dlen = out_len;
-- 
2.28.0
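
The estimate-then-init pattern used by zstd_comp_init() and
zstd_decomp_init() above can be sketched without zstd itself: size a
workspace, allocate it, then place the context inside the caller's memory
and fail cleanly if the memory is unusable. Everything named sketch_* is
hypothetical, and the 8-byte alignment check is an assumption modelled on
the documented requirement for static zstd workspaces.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical context that lives at the start of a caller-owned workspace. */
struct sketch_ctx {
	size_t capacity;        /* scratch bytes available after the header */
	unsigned char *scratch; /* first byte after the header */
};

/* Upper bound on the workspace needed for a given amount of scratch space. */
static inline size_t sketch_estimate_size(size_t scratch_bytes)
{
	return sizeof(struct sketch_ctx) + scratch_bytes;
}

/*
 * Place a context inside the workspace. Returns NULL if the workspace is
 * missing, misaligned, or too small to hold the header plus any scratch.
 */
static inline struct sketch_ctx *sketch_init_static(void *wksp, size_t wksp_size)
{
	struct sketch_ctx *ctx;

	if (wksp == NULL)
		return NULL;
	if (((uintptr_t)wksp & 7) != 0)
		return NULL;
	if (wksp_size <= sizeof(*ctx))
		return NULL;

	memset(wksp, 0, sizeof(*ctx));
	ctx = wksp;
	ctx->scratch = (unsigned char *)wksp + sizeof(*ctx);
	ctx->capacity = wksp_size - sizeof(*ctx);
	return ctx;
}
```

The caller keeps ownership of the workspace, so tearing down the context is
just freeing that one allocation, which is what lets the kernel callers pair
a single vzalloc() with a single vfree().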



[PATCH v5 2/9] lib: zstd: Add decompress_sources.h for decompress_unzstd

2020-11-02 Thread Nick Terrell
From: Nick Terrell 

Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.

This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.

Signed-off-by: Nick Terrell 
---
 lib/decompress_unzstd.c   |  6 +-
 lib/zstd/decompress_sources.h | 14 ++
 2 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 lib/zstd/decompress_sources.h

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index 6bb805aeec08..3c6ad01ffcd5 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
 #ifdef STATIC
 # define UNZSTD_PREBOOT
 # include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
 #endif
 
 #include 
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index ..ccb4960ea0cd
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
-- 
2.28.0



[PATCH v5 1/9] lib: zstd: Add zstd compatibility wrapper

2020-11-02 Thread Nick Terrell
From: Nick Terrell 

Adds zstd_compat.h which provides the necessary functions from the
current zstd.h API. It is only active for zstd versions 1.4.6 and newer.
That means it is disabled currently, but will become active when a later
patch in this series updates the zstd library in the kernel to 1.4.6.

This header allows the zstd upgrade to 1.4.6 without changing any
callers, since they all include zstd through the compatibility wrapper.
Later patches in this series transition each caller away from the
compatibility wrapper. After all the callers have been transitioned away
from the compatibility wrapper, the final patch in this series deletes
it.

Signed-off-by: Nick Terrell 
---
 crypto/zstd.c   |   2 +-
 fs/btrfs/zstd.c |   2 +-
 fs/f2fs/compress.c  |   2 +-
 fs/squashfs/zstd_wrapper.c  |   2 +-
 include/linux/zstd_compat.h | 116 
 lib/decompress_unzstd.c |   2 +-
 6 files changed, 121 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/zstd_compat.h

diff --git a/crypto/zstd.c b/crypto/zstd.c
index 1a3309f066f7..dcda3cad3b5c 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 
diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index 9a4871636c6c..a7367ff573d4 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include "misc.h"
 #include "compression.h"
 #include "ctree.h"
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 14262e0f1cd6..57a6360b9827 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #include "f2fs.h"
 #include "node.h"
diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
index b7cb1faa652d..f8c512a6204e 100644
--- a/fs/squashfs/zstd_wrapper.c
+++ b/fs/squashfs/zstd_wrapper.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include "squashfs_fs.h"
diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h
new file mode 100644
index ..cda9208bf04a
--- /dev/null
+++ b/include/linux/zstd_compat.h
@@ -0,0 +1,116 @@
+/*
+ * Copyright (c) 2016-present, Facebook, Inc.
+ * All rights reserved.
+ *
+ * This source code is licensed under the BSD-style license found in the
+ * LICENSE file in the root directory of https://github.com/facebook/zstd.
+ * An additional grant of patent rights can be found in the PATENTS file in the
+ * same directory.
+ *
+ * This program is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU General Public License version 2 as published by the
+ * Free Software Foundation. This program is dual-licensed; you may select
+ * either version 2 of the GNU General Public License ("GPL") or BSD license
+ * ("BSD").
+ */
+
+#ifndef ZSTD_COMPAT_H
+#define ZSTD_COMPAT_H
+
+#include 
+
+#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406)
+/*
+ * This header provides backwards compatibility for the zstd-1.4.6 library
+ * upgrade. This header allows us to upgrade the zstd library version without
+ * modifying any callers. Then we will migrate callers from the compatibility
+ * wrapper one at a time until none remain. At which point we will delete this
+ * header.
+ *
+ * It is temporary and will be deleted once the upgrade is complete.
+ */
+
+#include 
+
+static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params)
+{
+return ZSTD_estimateCCtxSize_usingCParams(compression_params);
+}
+
+static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params)
+{
+return ZSTD_estimateCStreamSize_usingCParams(compression_params);
+}
+
+static inline size_t ZSTD_DCtxWorkspaceBound(void)
+{
+return ZSTD_estimateDCtxSize();
+}
+
+static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size)
+{
+return ZSTD_estimateDStreamSize(window_size);
+}
+
+static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size)
+{
+if (wksp == NULL)
+return NULL;
+return ZSTD_initStaticCCtx(wksp, wksp_size);
+}
+
+static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, uint64_t pledged_src_size, void* wksp, size_t wksp_size)
+{
+ZSTD_CStream* cstream;
+size_t ret;
+
+if (wksp == NULL)
+return NULL;
+
+cstream = ZSTD_initStaticCStream(wksp, wksp_size);
+if (cstream == NULL)
+return NULL;
+
+/* 0 means unknown in old API but means 0 in new API */
+if (pledged_src_size == 0)
+pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN;
+
+ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size);
+if (ZSTD_isError(ret))
+return NULL;
+
+return cstream;
+}
+#define ZS

Re: [GIT PULL][PATCH v4 0/9] Update to zstd-1.4.6

2020-10-01 Thread Nick Terrell


> On Oct 1, 2020, at 3:18 AM, David Sterba  wrote:
> 
> On Wed, Sep 30, 2020 at 08:49:49PM +0000, Nick Terrell wrote:
>>> On Sep 29, 2020, at 11:53 PM, Nick Terrell  wrote:
>>> 
>>> From: Nick Terrell 
>> 
>> It has been brought to my attention that patch 3 hasn’t made it to patchwork,
>> likely because it is too large. I’ll include a pull request in the next
>> cover letter, together with the patches (if needed).
> 
> The patch 3/9 saved to a file is 1.6M, over 35000 lines, the diffstat
> says:
> 
> 66 files changed, 24268 insertions(+), 12889 deletions(-)
> 
> Seriously, this is wrong in so many ways. There's the rationale for
> one-time change etc, but the actual result is beyond what I would accept
> and would not encourage anyone to merge as-is.

I’m open to suggestions on how to get a zstd update done better. I don’t
know of any way to break this patch up into smaller patches that all compile.
The code is all generated directly from upstream and modified to work in the
kernel by automated scripts.

I think the benefits of updating zstd are pretty clear: bug fixes, 3 years of
testing, features, debuggability, support from zstd upstream, and significant
performance improvements.

So I hope we can come up with a way forward to get this merged.

A patch this large is a one-time change. But zstd updates in general
will be large, containing 100s of commits worth of changes (as opposed to
~3500 and a structure change in this diff). E.g. the diff between two
upstream versions ranges from 50KB to 500KB. Zstd is an actively
maintained project, so there is going to be churn when consuming it. But it
also means that we’re actively supporting the project if any problems occur.

My view is that kernel developers don’t need to review upstream zstd’s code. We
should focus on the diff from upstream, and on ensuring that everything works in
the kernel environment. The imported code from upstream zstd is ~30K LOC, which
is too large for anyone to reasonably review.

As mentioned in the patch, this commit shows the diff from upstream zstd, which
is much more manageable:

https://github.com/terrelln/linux/commit/467c9ea1df1100db48c020c3c8b282a2a30f5116

I generated it by importing upstream zstd as-is into the kernel file structure,
then running the automation to generate the kernel patch from upstream and
importing it into the kernel on top of the upstream patch.

Best,
Nick

Re: [GIT PULL][PATCH v4 0/9] Update to zstd-1.4.6

2020-09-30 Thread Nick Terrell
> On Sep 29, 2020, at 11:53 PM, Nick Terrell  wrote:
> 
> From: Nick Terrell 

It has been brought to my attention that patch 3 hasn’t made it to patchwork,
likely because it is too large. I’ll include a pull request in the next cover
letter, together with the patches (if needed).

Please pull from

 g...@github.com:terrelln/linux.git tags/v4-zstd-1.4.6

to get these changes.

> This patchset upgrades the zstd library to the latest upstream release. The
> current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
> At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
> But, it is now possible to use upstream zstd directly in the kernel.
> 
> I have not yet released zstd-1.4.6 upstream. I want the zstd version in the kernel
> to match up with a known upstream release, so we know exactly what code is
> running. Whenever this patchset is ready for merge, I will cut a release at the
> upstream commit that gets merged. This should not be necessary for future
> releases.
> 
> The kernel zstd library is automatically generated from upstream zstd. A script
> makes the necessary changes and imports it into the kernel. The changes are:
> 
> 1. Replace all libc dependencies with kernel replacements and rewrite includes.
> 2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
> 3. Use the kernel xxhash instead of bundling it.
> 
> This automation gets tested every commit by upstream's continuous integration.
> When we cut a new zstd release, we will submit a patch to the kernel to update
> the zstd version in the kernel.
> 
> I've updated zstd to upstream with one big patch because every commit must build,
> so that precludes partial updates. Since the commit is 100% generated, I hope the
> review burden is lightened. I considered replaying upstream commits, but that is
> not possible because there have been ~3500 upstream commits since the last zstd
> import, and the commits don't all build individually. The bulk update preserves
> bisectability because bugs can be bisected to the zstd version update. At that
> point the update can be reverted, and we can work with upstream to find and fix
> the bug. After this big switch in how the kernel consumes zstd, future patches
> will be smaller, because they will only have one upstream release worth of
> changes each.
> 
> This patchset changes the zstd API from a custom kernel API to the upstream API.
> I considered wrapping the upstream API with a wrapper that is closer to the
> kernel style guide. Following advice from https://lkml.org/lkml/2020/9/17/814
> I've chosen to use the upstream API directly, to minimize opportunities to
> introduce bugs, and because using the upstream API directly makes debugging and
> communication with upstream easier.
> 
> This patchset comes in 3 parts:
> 1. The first 2 patches prepare for the zstd upgrade. The first patch adds a
>   compatibility wrapper so zstd can be upgraded without modifying any callers.
>   The second patch adds an indirection for how lib/decompress_unzstd.c
>   includes all decompression source files.
> 2. Import zstd-1.4.6. This patch is completely generated from upstream using
>   automated tooling.
> 3. Update all callers to the zstd-1.4.6 API then delete the compatibility
>   wrapper.
> 
> I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade
> using the compatibility wrapper, and after the final patch in this series. 
> 
> I tested kernel and initramfs decompression in i386 and arm.
> 
> I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
> I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
> I found:
> * BtrFS zstd compression at levels 1 and 3 is 5% faster
> * BtrFS zstd decompression+read is 15% faster
> * SquashFS zstd decompression+read is 15% faster
> * F2FS zstd compression+write at level 3 is 8% faster
> * F2FS zstd decompression+read is 20% faster
> * ZRAM decompression+read is 30% faster
> * Kernel zstd decompression is 35% faster
> * Initramfs zstd decompression+build is 5% faster
> 
> The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during
> compression. For example the recent problem with large kernel decompression has
> been fixed upstream for over 2 years https://lkml.org/lkml/2020/9/29/27.
> 
> Please let me know if there is anything that I can do to ease the way for these
> patches. I think it is important because it gets large performance improvements,
> contains bug fixes, and is switching to a more maintainable model of consuming
> upstream zstd directly, making it easy to keep up to date.

Re: [PATCH v4 0/9] Update to zstd-1.4.6

2020-09-30 Thread Nick Terrell


> On Sep 29, 2020, at 11:53 PM, Christoph Hellwig  wrote:
> 
> As you keep resending this I keep telling you that you should not do it.
> Please provide a proper Linux API, and switch to that.  Versioned APIs
> have absolutely no business in the Linux kernel.

The API is not versioned. We provide a stable ABI for a large section of our API,
and the parts that aren’t ABI stable don’t change in semantics, and undergo long
deprecation periods before being removed.

The change of callers is a one-time change to transition from the existing API
in the kernel, which was never upstream's API, to upstream's API.

-Nick

[PATCH v4 8/9] lib: unzstd: Switch to the zstd-1.4.6 API

2020-09-30 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 lib/decompress_unzstd.c | 40 ++--
 1 file changed, 14 insertions(+), 26 deletions(-)

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index a79f705f236d..d4685df0e120 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -73,7 +73,8 @@
 
 #include 
 #include 
-#include 
+#include 
+#include 
 
 /* 128MB is the maximum window size supported by zstd. */
 #define ZSTD_WINDOWSIZE_MAX	(1 << ZSTD_WINDOWLOG_MAX)
@@ -120,9 +121,9 @@ static int INIT decompress_single(const u8 *in_buf, long in_len, u8 *out_buf,
  long out_len, long *in_pos,
  void (*error)(char *x))
 {
-   const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+   const size_t wksp_size = ZSTD_estimateDCtxSize();
void *wksp = large_malloc(wksp_size);
-   ZSTD_DCtx *dctx = ZSTD_initDCtx(wksp, wksp_size);
+   ZSTD_DCtx *dctx = ZSTD_initStaticDCtx(wksp, wksp_size);
int err;
size_t ret;
 
@@ -165,7 +166,6 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
 {
ZSTD_inBuffer in;
ZSTD_outBuffer out;
-   ZSTD_frameParams params;
void *in_allocated = NULL;
void *out_allocated = NULL;
void *wksp = NULL;
@@ -229,36 +229,24 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
out.size = out_len;
 
/*
-* We need to know the window size to allocate the ZSTD_DStream.
-* Since we are streaming, we need to allocate a buffer for the sliding
-* window. The window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
-* (8 MB), so it is important to use the actual value so as not to
-* waste memory when it is smaller.
+* Zstd determines the workspace size from the window size written
+* into the frame header. This ensures that we use the minimum value
+* possible, since the window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
+* (1 GB), so it is very important to use the actual value.
 */
-   ret = ZSTD_getFrameParams(&params, in.src, in.size);
+   wksp_size = ZSTD_estimateDStreamSize_fromFrame(in.src, in.size);
err = handle_zstd_error(ret, error);
if (err)
goto out;
-   if (ret != 0) {
-   error("ZSTD-compressed data has an incomplete frame header");
-   err = -1;
-   goto out;
-   }
-   if (params.windowSize > ZSTD_WINDOWSIZE_MAX) {
-   error("ZSTD-compressed data has too large a window size");
+   wksp = large_malloc(wksp_size);
+   if (wksp == NULL) {
+   error("Out of memory while allocating ZSTD_DStream");
err = -1;
goto out;
}
-
-   /*
-* Allocate the ZSTD_DStream now that we know how much memory is
-* required.
-*/
-   wksp_size = ZSTD_DStreamWorkspaceBound(params.windowSize);
-   wksp = large_malloc(wksp_size);
-   dstream = ZSTD_initDStream(params.windowSize, wksp, wksp_size);
+   dstream = ZSTD_initStaticDStream(wksp, wksp_size);
if (dstream == NULL) {
-   error("Out of memory while allocating ZSTD_DStream");
+   error("ZSTD_initStaticDStream failed");
err = -1;
goto out;
}
-- 
2.28.0



[PATCH v4 9/9] lib: zstd: Remove zstd compatibility wrapper

2020-09-30 Thread Nick Terrell
From: Nick Terrell 

All callers have been transitioned to the new zstd-1.4.6 API. There are
no more callers of the zstd compatibility wrapper, so delete it.

Signed-off-by: Nick Terrell 
---
 include/linux/zstd_compat.h | 116 
 1 file changed, 116 deletions(-)
 delete mode 100644 include/linux/zstd_compat.h

diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h
deleted file mode 100644
index cda9208bf04a..
--- a/include/linux/zstd_compat.h
+++ /dev/null
@@ -1,116 +0,0 @@
-/*
- * Copyright (c) 2016-present, Facebook, Inc.
- * All rights reserved.
- *
- * This source code is licensed under the BSD-style license found in the
- * LICENSE file in the root directory of https://github.com/facebook/zstd.
- * An additional grant of patent rights can be found in the PATENTS file in the
- * same directory.
- *
- * This program is free software; you can redistribute it and/or modify it under
- * the terms of the GNU General Public License version 2 as published by the
- * Free Software Foundation. This program is dual-licensed; you may select
- * either version 2 of the GNU General Public License ("GPL") or BSD license
- * ("BSD").
- */
-
-#ifndef ZSTD_COMPAT_H
-#define ZSTD_COMPAT_H
-
-#include 
-
-#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406)
-/*
- * This header provides backwards compatibility for the zstd-1.4.6 library
- * upgrade. This header allows us to upgrade the zstd library version without
- * modifying any callers. Then we will migrate callers from the compatibility
- * wrapper one at a time until none remain. At which point we will delete this
- * header.
- *
- * It is temporary and will be deleted once the upgrade is complete.
- */
-
-#include 
-
-static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params)
-{
-return ZSTD_estimateCCtxSize_usingCParams(compression_params);
-}
-
-static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params)
-{
-return ZSTD_estimateCStreamSize_usingCParams(compression_params);
-}
-
-static inline size_t ZSTD_DCtxWorkspaceBound(void)
-{
-return ZSTD_estimateDCtxSize();
-}
-
-static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size)
-{
-return ZSTD_estimateDStreamSize(window_size);
-}
-
-static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size)
-{
-if (wksp == NULL)
-return NULL;
-return ZSTD_initStaticCCtx(wksp, wksp_size);
-}
-
-static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, uint64_t pledged_src_size, void* wksp, size_t wksp_size)
-{
-ZSTD_CStream* cstream;
-size_t ret;
-
-if (wksp == NULL)
-return NULL;
-
-cstream = ZSTD_initStaticCStream(wksp, wksp_size);
-if (cstream == NULL)
-return NULL;
-
-/* 0 means unknown in old API but means 0 in new API */
-if (pledged_src_size == 0)
-pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN;
-
-ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size);
-if (ZSTD_isError(ret))
-return NULL;
-
-return cstream;
-}
-#define ZSTD_initCStream ZSTD_initCStream_compat
-
-static inline ZSTD_DCtx* ZSTD_initDCtx(void* wksp, size_t wksp_size)
-{
-if (wksp == NULL)
-return NULL;
-return ZSTD_initStaticDCtx(wksp, wksp_size);
-}
-
-static inline ZSTD_DStream* ZSTD_initDStream_compat(unsigned long long window_size, void* wksp, size_t wksp_size)
-{
-if (wksp == NULL)
-return NULL;
-(void)window_size;
-return ZSTD_initStaticDStream(wksp, wksp_size);
-}
-#define ZSTD_initDStream ZSTD_initDStream_compat
-
-typedef ZSTD_frameHeader ZSTD_frameParams;
-
-static inline size_t ZSTD_getFrameParams(ZSTD_frameParams* frame_params, const 
void* src, size_t src_size)
-{
-return ZSTD_getFrameHeader(frame_params, src, src_size);
-}
-
-static inline size_t ZSTD_compressCCtx_compat(ZSTD_CCtx* cctx, void* dst, 
size_t dst_capacity, const void* src, size_t src_size, ZSTD_parameters params)
-{
-return ZSTD_compress_advanced(cctx, dst, dst_capacity, src, src_size, 
NULL, 0, params);
-}
-#define ZSTD_compressCCtx ZSTD_compressCCtx_compat
-
-#endif /* ZSTD_VERSION_NUMBER >= 10406 */
-#endif /* ZSTD_COMPAT_H */
-- 
2.28.0



[PATCH v4 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API

2020-09-30 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 fs/btrfs/zstd.c | 48 
 1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index a7367ff573d4..6b466e090cd7 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include "misc.h"
 #include "compression.h"
 #include "ctree.h"
@@ -159,8 +159,8 @@ static void zstd_calc_ws_mem_sizes(void)
zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT);
size_t level_size =
max_t(size_t,
- ZSTD_CStreamWorkspaceBound(params.cParams),
- ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT));
+ 
ZSTD_estimateCStreamSize_usingCParams(params.cParams),
+ ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT));
 
max_size = max_t(size_t, max_size, level_size);
zstd_ws_mem_sizes[level - 1] = max_size;
@@ -389,13 +389,23 @@ int zstd_compress_pages(struct list_head *ws, struct 
address_space *mapping,
*total_in = 0;
 
/* Initialize the stream */
-   stream = ZSTD_initCStream(params, len, workspace->mem,
-   workspace->size);
+   stream = ZSTD_initStaticCStream(workspace->mem, workspace->size);
if (!stream) {
-   pr_warn("BTRFS: ZSTD_initCStream failed\n");
+   pr_warn("BTRFS: ZSTD_initStaticCStream failed\n");
ret = -EIO;
goto out;
}
+   {
+   size_t ret2;
+
+   ret2 = ZSTD_initCStream_advanced(stream, NULL, 0, params, len);
+   if (ZSTD_isError(ret2)) {
+   pr_warn("BTRFS: ZSTD_initCStream_advanced returned 
%s\n",
+   ZSTD_getErrorName(ret2));
+   ret = -EIO;
+   goto out;
+   }
+   }
 
/* map in the first page of input data */
in_page = find_get_page(mapping, start >> PAGE_SHIFT);
@@ -421,8 +431,8 @@ int zstd_compress_pages(struct list_head *ws, struct 
address_space *mapping,
ret2 = ZSTD_compressStream(stream, >out_buf,
>in_buf);
if (ZSTD_isError(ret2)) {
-   pr_debug("BTRFS: ZSTD_compressStream returned %d\n",
-   ZSTD_getErrorCode(ret2));
+   pr_debug("BTRFS: ZSTD_compressStream returned %s\n",
+   ZSTD_getErrorName(ret2));
ret = -EIO;
goto out;
}
@@ -489,8 +499,8 @@ int zstd_compress_pages(struct list_head *ws, struct 
address_space *mapping,
 
ret2 = ZSTD_endStream(stream, >out_buf);
if (ZSTD_isError(ret2)) {
-   pr_debug("BTRFS: ZSTD_endStream returned %d\n",
-   ZSTD_getErrorCode(ret2));
+   pr_debug("BTRFS: ZSTD_endStream returned %s\n",
+   ZSTD_getErrorName(ret2));
ret = -EIO;
goto out;
}
@@ -557,10 +567,9 @@ int zstd_decompress_bio(struct list_head *ws, struct 
compressed_bio *cb)
unsigned long buf_start;
unsigned long total_out = 0;
 
-   stream = ZSTD_initDStream(
-   ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
+   stream = ZSTD_initStaticDStream(workspace->mem, workspace->size);
if (!stream) {
-   pr_debug("BTRFS: ZSTD_initDStream failed\n");
+   pr_debug("BTRFS: ZSTD_initStaticDStream failed\n");
ret = -EIO;
goto done;
}
@@ -579,8 +588,8 @@ int zstd_decompress_bio(struct list_head *ws, struct 
compressed_bio *cb)
ret2 = ZSTD_decompressStream(stream, >out_buf,
>in_buf);
if (ZSTD_isError(ret2)) {
-   pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
-   ZSTD_getErrorCode(ret2));
+   pr_debug("BTRFS: ZSTD_decompressStream returned %s\n",
+   ZSTD_getErrorName(ret2));
ret = -EIO;
goto done;
}
@@ -633,10 +642,9 @@ int zstd_decompress(struct list_head *ws, unsigned char 
*data_in,
unsigned long pg_offset = 0;
char *kaddr;
 
-

[PATCH v4 7/9] squashfs: zstd: Switch to the zstd-1.4.6 API

2020-09-30 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 fs/squashfs/zstd_wrapper.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
index f8c512a6204e..add582409866 100644
--- a/fs/squashfs/zstd_wrapper.c
+++ b/fs/squashfs/zstd_wrapper.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include "squashfs_fs.h"
@@ -34,7 +34,7 @@ static void *zstd_init(struct squashfs_sb_info *msblk, void 
*buff)
goto failed;
wksp->window_size = max_t(size_t,
msblk->block_size, SQUASHFS_METADATA_SIZE);
-   wksp->mem_size = ZSTD_DStreamWorkspaceBound(wksp->window_size);
+   wksp->mem_size = ZSTD_estimateDStreamSize(wksp->window_size);
wksp->mem = vmalloc(wksp->mem_size);
if (wksp->mem == NULL)
goto failed;
@@ -71,7 +71,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, 
void *strm,
struct bvec_iter_all iter_all = {};
struct bio_vec *bvec = bvec_init_iter_all(_all);
 
-   stream = ZSTD_initDStream(wksp->window_size, wksp->mem, wksp->mem_size);
+   stream = ZSTD_initStaticDStream(wksp->mem, wksp->mem_size);
 
if (!stream) {
ERROR("Failed to initialize zstd decompressor\n");
@@ -122,8 +122,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, 
void *strm,
break;
 
if (ZSTD_isError(zstd_err)) {
-   ERROR("zstd decompression error: %d\n",
-   (int)ZSTD_getErrorCode(zstd_err));
+   ERROR("zstd decompression error: %s\n", 
ZSTD_getErrorName(zstd_err));
error = -EIO;
break;
}
-- 
2.28.0



[PATCH v4 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API

2020-09-30 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is more efficient because it uses the single-pass API instead of
the streaming API. The streaming API is not necessary because the whole
input and output buffers are available. This saves memory because we
don't need to allocate a buffer for the window. It is also more
efficient because it saves unnecessary memcpy calls.

Compression memory increases from 168 KB to 204 KB because upstream
uses slightly more memory. Decompression memory decreases from 1.4 MB
to 158 KB.

Signed-off-by: Nick Terrell 
---
 fs/f2fs/compress.c | 102 +
 1 file changed, 38 insertions(+), 64 deletions(-)

diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index e056f3a2b404..b79efce81651 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -11,7 +11,8 @@
 #include 
 #include 
 #include 
-#include 
+#include 
+#include 
 
 #include "f2fs.h"
 #include "node.h"
@@ -298,21 +299,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = {
 static int zstd_init_compress_ctx(struct compress_ctx *cc)
 {
ZSTD_parameters params;
-   ZSTD_CStream *stream;
+   ZSTD_CCtx *ctx;
void *workspace;
unsigned int workspace_size;
 
params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0);
-   workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams);
+   workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams);
 
workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode),
workspace_size, GFP_NOFS);
if (!workspace)
return -ENOMEM;
 
-   stream = ZSTD_initCStream(params, 0, workspace, workspace_size);
-   if (!stream) {
-   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initCStream 
failed\n",
+   ctx = ZSTD_initStaticCCtx(workspace, workspace_size);
+   if (!ctx) {
+   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initStaticCCtx 
failed\n",
KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
__func__);
kvfree(workspace);
@@ -320,7 +321,7 @@ static int zstd_init_compress_ctx(struct compress_ctx *cc)
}
 
cc->private = workspace;
-   cc->private2 = stream;
+   cc->private2 = ctx;
 
cc->clen = cc->rlen - PAGE_SIZE - COMPRESS_HEADER_SIZE;
return 0;
@@ -335,65 +336,48 @@ static void zstd_destroy_compress_ctx(struct compress_ctx 
*cc)
 
 static int zstd_compress_pages(struct compress_ctx *cc)
 {
-   ZSTD_CStream *stream = cc->private2;
-   ZSTD_inBuffer inbuf;
-   ZSTD_outBuffer outbuf;
-   int src_size = cc->rlen;
-   int dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
-   int ret;
-
-   inbuf.pos = 0;
-   inbuf.src = cc->rbuf;
-   inbuf.size = src_size;
-
-   outbuf.pos = 0;
-   outbuf.dst = cc->cbuf->cdata;
-   outbuf.size = dst_size;
-
-   ret = ZSTD_compressStream(stream, , );
-   if (ZSTD_isError(ret)) {
-   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compressStream 
failed, ret: %d\n",
-   KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
-   __func__, ZSTD_getErrorCode(ret));
-   return -EIO;
-   }
-
-   ret = ZSTD_endStream(stream, );
+   ZSTD_CCtx *ctx = cc->private2;
+   const size_t src_size = cc->rlen;
+   const size_t dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
+   ZSTD_parameters params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, 
src_size, 0);
+   size_t ret;
+
+   ret = ZSTD_compress_advanced(
+   ctx, cc->cbuf->cdata, dst_size, cc->rbuf, src_size, 
NULL, 0, params);
if (ZSTD_isError(ret)) {
-   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_endStream returned 
%d\n",
+   /*
+* there is compressed data remained in intermediate buffer due 
to
+* no more space in cbuf.cdata
+*/
+   if (ZSTD_getErrorCode(ret) == ZSTD_error_dstSize_tooSmall)
+   return -EAGAIN;
+   /* other compression errors return -EIO */
+   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compress_advanced 
failed, err: %s\n",
KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
-   __func__, ZSTD_getErrorCode(ret));
+   __func__, ZSTD_getErrorName(ret));
return -EIO;
}
 
-   /*
-* there is compressed data remained in intermediate buffer due to
-* no more space in cbuf.cdata
-*/
-   if (ret)
-   return -EAGAIN;
-
-   cc->clen = outbuf.

[PATCH v4 4/9] crypto: zstd: Switch to zstd-1.4.6 API

2020-09-30 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 crypto/zstd.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/crypto/zstd.c b/crypto/zstd.c
index dcda3cad3b5c..767fe2fbe009 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 
@@ -24,16 +24,15 @@ struct zstd_ctx {
void *dwksp;
 };
 
-static ZSTD_parameters zstd_params(void)
-{
-   return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
-}
-
 static int zstd_comp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const ZSTD_parameters params = zstd_params();
-   const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+   const size_t wksp_size = ZSTD_estimateCCtxSize(ZSTD_DEF_LEVEL);
+
+   if (ZSTD_isError(wksp_size)) {
+   ret = -EINVAL;
+   goto out_free;
+   }
 
ctx->cwksp = vzalloc(wksp_size);
if (!ctx->cwksp) {
@@ -41,7 +40,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+   ctx->cctx = ZSTD_initStaticCCtx(ctx->cwksp, wksp_size);
if (!ctx->cctx) {
ret = -EINVAL;
goto out_free;
@@ -56,7 +55,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 static int zstd_decomp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+   const size_t wksp_size = ZSTD_estimateDCtxSize();
 
ctx->dwksp = vzalloc(wksp_size);
if (!ctx->dwksp) {
@@ -64,7 +63,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+   ctx->dctx = ZSTD_initStaticDCtx(ctx->dwksp, wksp_size);
if (!ctx->dctx) {
ret = -EINVAL;
goto out_free;
@@ -152,9 +151,8 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
 {
size_t out_len;
struct zstd_ctx *zctx = ctx;
-   const ZSTD_parameters params = zstd_params();
 
-   out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
+   out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, 
ZSTD_DEF_LEVEL);
if (ZSTD_isError(out_len))
return -EINVAL;
*dlen = out_len;
-- 
2.28.0



[PATCH v4 1/9] lib: zstd: Add zstd compatibility wrapper

2020-09-30 Thread Nick Terrell
From: Nick Terrell 

Adds zstd_compat.h which provides the necessary functions from the
current zstd.h API. It is only active for zstd versions 1.4.6 and newer.
That means it is disabled currently, but will become active when a later
patch in this series updates the zstd library in the kernel to 1.4.6.

This header allows the zstd upgrade to 1.4.6 without changing any
callers, since they all include zstd through the compatibility wrapper.
Later patches in this series transition each caller away from the
compatibility wrapper. After all the callers have been transitioned away
from the compatibility wrapper, the final patch in this series deletes
it.

Signed-off-by: Nick Terrell 
---
 crypto/zstd.c   |   2 +-
 fs/btrfs/zstd.c |   2 +-
 fs/f2fs/compress.c  |   2 +-
 fs/squashfs/zstd_wrapper.c  |   2 +-
 include/linux/zstd_compat.h | 116 
 lib/decompress_unzstd.c |   2 +-
 6 files changed, 121 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/zstd_compat.h

diff --git a/crypto/zstd.c b/crypto/zstd.c
index 1a3309f066f7..dcda3cad3b5c 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 
diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index 9a4871636c6c..a7367ff573d4 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include "misc.h"
 #include "compression.h"
 #include "ctree.h"
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 1dfb126a0cb2..e056f3a2b404 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #include "f2fs.h"
 #include "node.h"
diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
index b7cb1faa652d..f8c512a6204e 100644
--- a/fs/squashfs/zstd_wrapper.c
+++ b/fs/squashfs/zstd_wrapper.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include "squashfs_fs.h"
diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h
new file mode 100644
index ..cda9208bf04a
--- /dev/null
+++ b/include/linux/zstd_compat.h
@@ -0,0 +1,116 @@
+/*
+ * Copyright (c) 2016-present, Facebook, Inc.
+ * All rights reserved.
+ *
+ * This source code is licensed under the BSD-style license found in the
+ * LICENSE file in the root directory of https://github.com/facebook/zstd.
+ * An additional grant of patent rights can be found in the PATENTS file in the
+ * same directory.
+ *
+ * This program is free software; you can redistribute it and/or modify it 
under
+ * the terms of the GNU General Public License version 2 as published by the
+ * Free Software Foundation. This program is dual-licensed; you may select
+ * either version 2 of the GNU General Public License ("GPL") or BSD license
+ * ("BSD").
+ */
+
+#ifndef ZSTD_COMPAT_H
+#define ZSTD_COMPAT_H
+
+#include 
+
+#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406)
+/*
+ * This header provides backwards compatibility for the zstd-1.4.6 library
+ * upgrade. This header allows us to upgrade the zstd library version without
+ * modifying any callers. Then we will migrate callers from the compatibility
+ * wrapper one at a time until none remain. At which point we will delete this
+ * header.
+ *
+ * It is temporary and will be deleted once the upgrade is complete.
+ */
+
+#include 
+
+static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters 
compression_params)
+{
+return ZSTD_estimateCCtxSize_usingCParams(compression_params);
+}
+
+static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters 
compression_params)
+{
+return ZSTD_estimateCStreamSize_usingCParams(compression_params);
+}
+
+static inline size_t ZSTD_DCtxWorkspaceBound(void)
+{
+return ZSTD_estimateDCtxSize();
+}
+
+static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size)
+{
+return ZSTD_estimateDStreamSize(window_size);
+}
+
+static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size)
+{
+if (wksp == NULL)
+return NULL;
+return ZSTD_initStaticCCtx(wksp, wksp_size);
+}
+
+static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, 
uint64_t pledged_src_size, void* wksp, size_t wksp_size)
+{
+ZSTD_CStream* cstream;
+size_t ret;
+
+if (wksp == NULL)
+return NULL;
+
+cstream = ZSTD_initStaticCStream(wksp, wksp_size);
+if (cstream == NULL)
+return NULL;
+
+/* 0 means unknown in old API but means 0 in new API */
+if (pledged_src_size == 0)
+pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN;
+
+ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, 
pledged_src_size);
+if (ZSTD_isError(ret))
+return NULL;
+
+return cstream;
+}
+#define ZS

[PATCH v4 0/9] Update to zstd-1.4.6

2020-09-30 Thread Nick Terrell
From: Nick Terrell 

This patchset upgrades the zstd library to the latest upstream release. The
current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.6 upstream. I want the zstd version in the kernel
to match up with a known upstream release, so we know exactly what code is
running. Whenever this patchset is ready for merge, I will cut a release at the
upstream commit that gets merged. This should not be necessary for future
releases.

The kernel zstd library is automatically generated from upstream zstd. A script
makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested every commit by upstream's continuous integration.
When we cut a new zstd release, we will submit a patch to the kernel to update
the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must build,
so that precludes partial updates. Since the commit is 100% generated, I hope the
review burden is lightened. I considered replaying upstream commits, but that is
not possible because there have been ~3500 upstream commits since the last zstd
import, and the commits don't all build individually. The bulk update preserves
bisectability because bugs can be bisected to the zstd version update. At that
point the update can be reverted, and we can work with upstream to find and fix
the bug. After this big switch in how the kernel consumes zstd, future patches
will be smaller, because they will only have one upstream release worth of
changes each.

This patchset changes the zstd API from a custom kernel API to the upstream API.
I considered wrapping the upstream API with a wrapper that is closer to the
kernel style guide. Following advice from https://lkml.org/lkml/2020/9/17/814
I've chosen to use the upstream API directly, to minimize opportunities to
introduce bugs, and because using the upstream API directly makes debugging and
communication with upstream easier.

This patchset comes in 3 parts:
1. The first 2 patches prepare for the zstd upgrade. The first patch adds a
   compatibility wrapper so zstd can be upgraded without modifying any callers.
   The second patch adds an indirection for how lib/decompress_unzstd.c includes
   all decompression source files.
2. Import zstd-1.4.6. This patch is completely generated from upstream using
   automated tooling.
3. Update all callers to the zstd-1.4.6 API then delete the compatibility
   wrapper.
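
To illustrate the first step, here is a minimal, self-contained sketch of a
version-gated compatibility wrapper. Every demo_* / DEMO_* name is hypothetical
and exists only for this example; it is not the contents of zstd_compat.h. The
one thing taken from zstd is the version encoding, where 1.4.6 becomes 10406
(major * 10000 + minor * 100 + release), which is the value the wrapper's #if
checks against.

```c
#include <stddef.h>

/* Hypothetical helper: encodes a version the way ZSTD_VERSION_NUMBER does. */
unsigned demo_version_number(unsigned major, unsigned minor, unsigned release)
{
	return major * 100 * 100 + minor * 100 + release;
}

/* Stand-in for the version macro of the library being wrapped (assumed 1.4.6). */
#define DEMO_LIB_VERSION_NUMBER 10406

/* Stand-in for a function in the new upstream-style API. */
static size_t demo_estimate_ctx_size(void)
{
	return 4096; /* arbitrary placeholder, not a real workspace size */
}

#if defined(DEMO_LIB_VERSION_NUMBER) && (DEMO_LIB_VERSION_NUMBER >= 10406)
/*
 * Old-style name kept alive for existing callers. It only exists when the
 * new library is present, so the header is inert before the upgrade.
 */
static inline size_t demo_ctx_workspace_bound(void)
{
	return demo_estimate_ctx_size();
}
#endif
```

Callers keep using the old name until they are converted one at a time, at
which point the wrapper has no users left and can be deleted.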

I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade
using the compatibility wrapper, and after the final patch in this series. 

I tested kernel and initramfs decompression in i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
I found:
* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during
compression. For example the recent problem with large kernel decompression has
been fixed upstream for over 2 years https://lkml.org/lkml/2020/9/29/27.

Please let me know if there is anything that I can do to ease the way for these
patches. I think it is important because it gets large performance improvements,
contains bug fixes, and is switching to a more maintainable model of consuming
upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix the ZSTD_initCStream() wrapper to treat pledged_src_size=0 as unknown.
  This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.
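
The pledged_src_size fix can be sketched in isolation as below. This is only an
illustration of the translation the wrapper performs, not the wrapper itself.
DEMO_CONTENTSIZE_UNKNOWN and demo_translate_pledged_src_size() are hypothetical
names, and I am assuming the upstream "unknown" sentinel is the all-ones 64-bit
value.

```c
#include <stdint.h>

/* Assumed stand-in for upstream's ZSTD_CONTENTSIZE_UNKNOWN (all bits set). */
#define DEMO_CONTENTSIZE_UNKNOWN UINT64_MAX

/*
 * The old kernel API used 0 to mean "size not known", while the new API
 * takes 0 literally as an empty input. Map the old convention onto the
 * new one and pass every other value through unchanged.
 */
uint64_t demo_translate_pledged_src_size(uint64_t old_api_size)
{
	if (old_api_size == 0)
		return DEMO_CONTENTSIZE_UNKNOWN;
	return old_api_size;
}
```

Without this mapping, a caller that passes 0 for "unknown" (as F2FS did) would
pledge an empty input and then fail as soon as it fed in real data.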

v2 -> v3:
* (3/9) Silence warnings by Kernel Test Robot:
  https://github.com/facebook/zstd/pull/2324
  Stack size warnings remain, but these aren't new, and the functions it warns on
  are either unused or not in the maximum stack path. This patchset reduces zstd
  compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and
  more stack reduction would require significant changes that have the potential
  to introduce new bugs. However, I do hope to continue to reduce zstd stack
  usage in future versions.

v3 -> v4:
* (3/9) Fix errors and

[PATCH v4 2/9] lib: zstd: Add decompress_sources.h for decompress_unzstd

2020-09-30 Thread Nick Terrell
From: Nick Terrell 

Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.

This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.

Signed-off-by: Nick Terrell 
---
 lib/decompress_unzstd.c   |  6 +-
 lib/zstd/decompress_sources.h | 14 ++
 2 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 lib/zstd/decompress_sources.h

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index dbc290af26b4..a79f705f236d 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
 #ifdef STATIC
 # define UNZSTD_PREBOOT
 # include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
 #endif
 
 #include 
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index ..ccb4960ea0cd
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
-- 
2.28.0



Re: PROBLEM: zstd bzImage decompression fails for some x86_32 config on 5.9-rc1

2020-09-28 Thread Nick Terrell


> On Sep 28, 2020, at 11:02 AM, Nick Terrell  wrote:
> 
> 
> 
>> On Sep 28, 2020, at 1:55 AM, Feng Tang  wrote:
>> 
>> Hi Nick,
>> 
>> 0day has found some kernel decompression failure case since 5.9-rc1 (X86_32
>> build), and it could be related with ZSTD code, though initially we bisected
>> to some other commits.
>> 
>> The error messages are: 
>>  
>>  early console in setup code
>>  Wrong EFI loader signature.
>>  early console in extract_kernel
>>  input_data: 0x046f50b4
>>  input_len: 0x01ebbeb6
>>  output: 0x0100
>>  output_len: 0x04fc535c
>>  kernel_total_size: 0x055f5000
>>  needed_size: 0x055f5000
>>  
>>  Decompressing Linux...
>>  
>>  ZSTD-compressed data is corrupt
>> 
>> This could be reproduced by compiling the kernel with attached config,
>> and use QEMU to boot it.
>> 
>> We suspect it could be related with the kernel size, as we only see
>> it on big kernel, and some more info are:
>> 
>> * If we remove a lot of kernel config to build a much smaller kernel,
>> it will boot fine
>> 
>> * If we change the zstd algorithm from zstd22 to zstd19, the kernel will
>> boot fine with below patch
>> 
>>  diff --git a/arch/x86/boot/compressed/Makefile 
>> b/arch/x86/boot/compressed/Makefile
>>  index 3962f59..8fe71ba 100644
>>  --- a/arch/x86/boot/compressed/Makefile
>>  +++ b/arch/x86/boot/compressed/Makefile
>>  @@ -147,7 +147,7 @@ $(obj)/vmlinux.bin.lzo: $(vmlinux.bin.all-y) FORCE
>>   $(obj)/vmlinux.bin.zst: $(vmlinux.bin.all-y) FORCE
>>  -   $(call if_changed,zstd22)
>>  +   $(call if_changed,zstd)
>> 
>> 
>> Please let me know if you need more info, and sorry for the late report
>> as we just tracked down to this point.
> 
> Thanks for the report, I will look into it today.

CC: Petr Malat

I’ve successfully reproduced the failure and found the issue. It turns out that this
patch [0] from Petr Malat fixes the issue. As I mentioned in that thread, his
fix corresponds to this upstream commit [1].

Can we get Petr's patch merged into v5.9?

This bug only happens when the window size is > 8 MB. A non-kernel workaround
would be to compress the kernel at level 19 instead of level 22, which uses an
8 MB window size instead of a 128 MB window size.

The reason it only shows up for large kernels is that the code is only buggy
when an offset > 8 MB is used, so a kernel <= 8 MB can't trigger the bug.
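
As a rough sketch of that arithmetic: all demo_* names below are hypothetical,
the window logs 23 and 27 are inferred from the 8 MB and 128 MB sizes above, and
bounding the largest offset by min(window size, data size) is a simplification.
The check is a necessary condition for hitting the bug, not a sufficient one.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets above this many bytes are the ones the buggy code mishandles. */
#define DEMO_BUG_THRESHOLD ((uint64_t)8 << 20) /* 8 MB */

/* Window size in bytes for a given window log (a power of two). */
uint64_t demo_window_size(unsigned window_log)
{
	assert(window_log < 64);
	return (uint64_t)1 << window_log;
}

/*
 * A match offset cannot reach further back than the window, nor further
 * than the amount of data produced, so the largest possible offset is the
 * smaller of the two. Returns 1 if that bound exceeds the threshold.
 */
int demo_can_exceed_threshold(unsigned window_log, uint64_t data_size)
{
	uint64_t window = demo_window_size(window_log);
	uint64_t max_offset = data_size < window ? data_size : window;

	return max_offset > DEMO_BUG_THRESHOLD;
}
```

With the output_len of 0x04fc535c from the report, a window log of 27 allows
offsets past the threshold while a window log of 23 does not, which matches the
level 22 versus level 19 behaviour.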

Best,
Nick

[0] https://lkml.org/lkml/2020/9/14/94
[1] 
https://github.com/facebook/zstd/commit/8a5c0c98ae5a7884694589d7a69bc99011add94d

> Best,
> Nick
> 
>> Thanks,
>> Feng
>> 
>> 
>> 
>> 



Re: PROBLEM: zstd bzImage decompression fails for some x86_32 config on 5.9-rc1

2020-09-28 Thread Nick Terrell



> On Sep 28, 2020, at 1:55 AM, Feng Tang  wrote:
> 
> Hi Nick,
> 
> 0day has found some kernel decompression failure case since 5.9-rc1 (X86_32
> build), and it could be related with ZSTD code, though initially we bisected
> to some other commits.
> 
> The error messages are: 
>   
>   early console in setup code
>   Wrong EFI loader signature.
>   early console in extract_kernel
>   input_data: 0x046f50b4
>   input_len: 0x01ebbeb6
>   output: 0x0100
>   output_len: 0x04fc535c
>   kernel_total_size: 0x055f5000
>   needed_size: 0x055f5000
>   
>   Decompressing Linux...
>   
>   ZSTD-compressed data is corrupt
> 
> This could be reproduced by compiling the kernel with attached config,
> and use QEMU to boot it.
> 
> We suspect it could be related with the kernel size, as we only see
> it on big kernel, and some more info are:
> 
> * If we remove a lot of kernel config to build a much smaller kernel,
>  it will boot fine
> 
> * If we change the zstd algorithm from zstd22 to zstd19, the kernel will
>  boot fine with below patch
> 
>   diff --git a/arch/x86/boot/compressed/Makefile 
> b/arch/x86/boot/compressed/Makefile
>   index 3962f59..8fe71ba 100644
>   --- a/arch/x86/boot/compressed/Makefile
>   +++ b/arch/x86/boot/compressed/Makefile
>   @@ -147,7 +147,7 @@ $(obj)/vmlinux.bin.lzo: $(vmlinux.bin.all-y) FORCE
>$(obj)/vmlinux.bin.zst: $(vmlinux.bin.all-y) FORCE
>   -   $(call if_changed,zstd22)
>   +   $(call if_changed,zstd)
> 
> 
> Please let me know if you need more info, and sorry for the late report
> as we just tracked down to this point.

Thanks for the report, I will look into it today.

Best,
Nick

> Thanks,
> Feng
> 
> 
> 
> 



Re: [PATCH v3 3/9] lib: zstd: Upgrade to latest upstream zstd version 1.4.6

2020-09-23 Thread Nick Terrell
On Wed, Sep 23, 2020 at 7:28 PM kernel test robot  wrote:
>
> Hi Nick,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on kdave/for-next]
> [also build test ERROR on f2fs/dev-test linus/master v5.9-rc6 next-20200923]
> [cannot apply to cryptodev/master crypto/master]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url:
> https://github.com/0day-ci/linux/commits/Nick-Terrell/Update-to-zstd-1-4-6/20200924-064102
> base:   https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git 
> for-next
> config: h8300-randconfig-p002-20200923 (attached as .config)
> compiler: h8300-linux-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
> wget 
> https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
> ~/bin/make.cross
> chmod +x ~/bin/make.cross
> # save the attached .config to linux build tree
> COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
> ARCH=h8300
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
>
> All errors (new ones prefixed by >>):
>
>h8300-linux-ld: lib/zstd/common/entropy_common.o: in function `MEM_swap32':
> >> lib/zstd/common/mem.h:179: undefined reference to `__bswapsi2'
> >> h8300-linux-ld: lib/zstd/common/mem.h:179: undefined reference to 
> >> `__bswapsi2'
> >> h8300-linux-ld: lib/zstd/common/mem.h:179: undefined reference to 
> >> `__bswapsi2'
> >> h8300-linux-ld: lib/zstd/common/mem.h:179: undefined reference to 
> >> `__bswapsi2'
>h8300-linux-ld: lib/zstd/common/fse_decompress.o: in function `MEM_swap32':
> >> lib/zstd/common/mem.h:179: undefined reference to `__bswapsi2'
>h8300-linux-ld: 
> lib/zstd/common/fse_decompress.o:lib/zstd/common/mem.h:179: more undefined 
> references to `__bswapsi2' follow
>h8300-linux-ld: lib/zstd/compress/zstd_compress.o: in function 
> `MEM_swap64':
> >> lib/zstd/compress/../common/mem.h:192: undefined reference to `__bswapdi2'
>h8300-linux-ld: lib/zstd/compress/zstd_compress.o: in function 
> `MEM_swap32':
> >> lib/zstd/compress/../common/mem.h:179: undefined reference to `__bswapsi2'
> >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:179: undefined reference 
> >> to `__bswapsi2'
> >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:179: undefined reference 
> >> to `__bswapsi2'
> >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:179: undefined reference 
> >> to `__bswapsi2'
> >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:179: undefined reference 
> >> to `__bswapsi2'
>h8300-linux-ld: 
> lib/zstd/compress/zstd_compress.o:lib/zstd/compress/../common/mem.h:179: more 
> undefined references to `__bswapsi2' follow
>h8300-linux-ld: lib/zstd/compress/zstd_double_fast.o: in function 
> `MEM_swap64':
> >> lib/zstd/compress/../common/mem.h:192: undefined reference to `__bswapdi2'
> >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference 
> >> to `__bswapdi2'
> >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference 
> >> to `__bswapdi2'
> >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference 
> >> to `__bswapdi2'
> >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference 
> >> to `__bswapdi2'
>h8300-linux-ld: 
> lib/zstd/compress/zstd_double_fast.o:lib/zstd/compress/../common/mem.h:192: 
> more undefined references to `__bswapdi2' follow
>h8300-linux-ld: lib/zstd/compress/zstd_opt.o: in function `MEM_swap32':
> >> lib/zstd/compress/../common/mem.h:179: undefined reference to `__bswapsi2'
> >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:179: undefined reference 
> >> to `__bswapsi2'
>h8300-linux-ld: lib/zstd/compress/zstd_opt.o: in function `MEM_swap64':
> >> lib/zstd/compress/../common/mem.h:192: undefined reference to `__bswapdi2'
> >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference 
> >> to `__bswapdi2'
>h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference 
> to `__bswapdi2'
>h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference 
> to `__bswapdi2'
>h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference 
> to `__bswapdi2'
>h8300-linux-ld: 
> lib/zstd/compress/zstd_opt.o:lib/zstd/compress/../common/mem.h:192: more 
> undefin

[PATCH v3 7/9] squashfs: zstd: Switch to the zstd-1.4.6 API

2020-09-23 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 fs/squashfs/zstd_wrapper.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
index f8c512a6204e..add582409866 100644
--- a/fs/squashfs/zstd_wrapper.c
+++ b/fs/squashfs/zstd_wrapper.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include "squashfs_fs.h"
@@ -34,7 +34,7 @@ static void *zstd_init(struct squashfs_sb_info *msblk, void *buff)
goto failed;
wksp->window_size = max_t(size_t,
msblk->block_size, SQUASHFS_METADATA_SIZE);
-   wksp->mem_size = ZSTD_DStreamWorkspaceBound(wksp->window_size);
+   wksp->mem_size = ZSTD_estimateDStreamSize(wksp->window_size);
wksp->mem = vmalloc(wksp->mem_size);
if (wksp->mem == NULL)
goto failed;
@@ -71,7 +71,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
struct bvec_iter_all iter_all = {};
struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
 
-   stream = ZSTD_initDStream(wksp->window_size, wksp->mem, wksp->mem_size);
+   stream = ZSTD_initStaticDStream(wksp->mem, wksp->mem_size);
 
if (!stream) {
ERROR("Failed to initialize zstd decompressor\n");
@@ -122,8 +122,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
break;
 
if (ZSTD_isError(zstd_err)) {
-   ERROR("zstd decompression error: %d\n",
-   (int)ZSTD_getErrorCode(zstd_err));
+   ERROR("zstd decompression error: %s\n", ZSTD_getErrorName(zstd_err));
error = -EIO;
break;
}
-- 
2.28.0



[PATCH v3 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API

2020-09-23 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is more efficient because it uses the single-pass API instead of
the streaming API. The streaming API is not necessary because the whole
input and output buffers are available. This saves memory because we
don't need to allocate a buffer for the window. It is also more
efficient because it saves unnecessary memcpy calls.

Compression memory increases from 168 KB to 204 KB because upstream
uses slightly more memory. Decompression memory decreases from 1.4 MB
to 158 KB.

Signed-off-by: Nick Terrell 
---
 fs/f2fs/compress.c | 102 +
 1 file changed, 38 insertions(+), 64 deletions(-)

diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index e056f3a2b404..b79efce81651 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -11,7 +11,8 @@
 #include 
 #include 
 #include 
-#include 
+#include 
+#include 
 
 #include "f2fs.h"
 #include "node.h"
@@ -298,21 +299,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = {
 static int zstd_init_compress_ctx(struct compress_ctx *cc)
 {
ZSTD_parameters params;
-   ZSTD_CStream *stream;
+   ZSTD_CCtx *ctx;
void *workspace;
unsigned int workspace_size;
 
params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0);
-   workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams);
+   workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams);
 
workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode),
workspace_size, GFP_NOFS);
if (!workspace)
return -ENOMEM;
 
-   stream = ZSTD_initCStream(params, 0, workspace, workspace_size);
-   if (!stream) {
-   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initCStream failed\n",
+   ctx = ZSTD_initStaticCCtx(workspace, workspace_size);
+   if (!ctx) {
+   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initStaticCCtx failed\n",
KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
__func__);
kvfree(workspace);
@@ -320,7 +321,7 @@ static int zstd_init_compress_ctx(struct compress_ctx *cc)
}
 
cc->private = workspace;
-   cc->private2 = stream;
+   cc->private2 = ctx;
 
cc->clen = cc->rlen - PAGE_SIZE - COMPRESS_HEADER_SIZE;
return 0;
@@ -335,65 +336,48 @@ static void zstd_destroy_compress_ctx(struct compress_ctx *cc)
 
 static int zstd_compress_pages(struct compress_ctx *cc)
 {
-   ZSTD_CStream *stream = cc->private2;
-   ZSTD_inBuffer inbuf;
-   ZSTD_outBuffer outbuf;
-   int src_size = cc->rlen;
-   int dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
-   int ret;
-
-   inbuf.pos = 0;
-   inbuf.src = cc->rbuf;
-   inbuf.size = src_size;
-
-   outbuf.pos = 0;
-   outbuf.dst = cc->cbuf->cdata;
-   outbuf.size = dst_size;
-
-   ret = ZSTD_compressStream(stream, &outbuf, &inbuf);
-   if (ZSTD_isError(ret)) {
-   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compressStream failed, ret: %d\n",
-   KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
-   __func__, ZSTD_getErrorCode(ret));
-   return -EIO;
-   }
-
-   ret = ZSTD_endStream(stream, &outbuf);
+   ZSTD_CCtx *ctx = cc->private2;
+   const size_t src_size = cc->rlen;
+   const size_t dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
+   ZSTD_parameters params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, src_size, 0);
+   size_t ret;
+
+   ret = ZSTD_compress_advanced(
+   ctx, cc->cbuf->cdata, dst_size, cc->rbuf, src_size, NULL, 0, params);
if (ZSTD_isError(ret)) {
-   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_endStream returned %d\n",
+   /*
+* there is compressed data remained in intermediate buffer due to
+* no more space in cbuf.cdata
+*/
+   if (ZSTD_getErrorCode(ret) == ZSTD_error_dstSize_tooSmall)
+   return -EAGAIN;
+   /* other compression errors return -EIO */
+   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compress_advanced failed, err: %s\n",
KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
-   __func__, ZSTD_getErrorCode(ret));
+   __func__, ZSTD_getErrorName(ret));
return -EIO;
}
 
-   /*
-* there is compressed data remained in intermediate buffer due to
-* no more space in cbuf.cdata
-*/
-   if (ret)
-   return -EAGAIN;
-
-   cc->clen = outbuf.

[PATCH v3 9/9] lib: zstd: Remove zstd compatibility wrapper

2020-09-23 Thread Nick Terrell
From: Nick Terrell 

All callers have been transitioned to the new zstd-1.4.6 API. There are
no more callers of the zstd compatibility wrapper, so delete it.

Signed-off-by: Nick Terrell 
---
 include/linux/zstd_compat.h | 116 
 1 file changed, 116 deletions(-)
 delete mode 100644 include/linux/zstd_compat.h

diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h
deleted file mode 100644
index cda9208bf04a..
--- a/include/linux/zstd_compat.h
+++ /dev/null
@@ -1,116 +0,0 @@
-/*
- * Copyright (c) 2016-present, Facebook, Inc.
- * All rights reserved.
- *
- * This source code is licensed under the BSD-style license found in the
- * LICENSE file in the root directory of https://github.com/facebook/zstd.
- * An additional grant of patent rights can be found in the PATENTS file in the
- * same directory.
- *
- * This program is free software; you can redistribute it and/or modify it under
- * the terms of the GNU General Public License version 2 as published by the
- * Free Software Foundation. This program is dual-licensed; you may select
- * either version 2 of the GNU General Public License ("GPL") or BSD license
- * ("BSD").
- */
-
-#ifndef ZSTD_COMPAT_H
-#define ZSTD_COMPAT_H
-
-#include 
-
-#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406)
-/*
- * This header provides backwards compatibility for the zstd-1.4.6 library
- * upgrade. This header allows us to upgrade the zstd library version without
- * modifying any callers. Then we will migrate callers from the compatibility
- * wrapper one at a time until none remain. At which point we will delete this
- * header.
- *
- * It is temporary and will be deleted once the upgrade is complete.
- */
-
-#include 
-
-static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params)
-{
-return ZSTD_estimateCCtxSize_usingCParams(compression_params);
-}
-
-static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params)
-{
-return ZSTD_estimateCStreamSize_usingCParams(compression_params);
-}
-
-static inline size_t ZSTD_DCtxWorkspaceBound(void)
-{
-return ZSTD_estimateDCtxSize();
-}
-
-static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size)
-{
-return ZSTD_estimateDStreamSize(window_size);
-}
-
-static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size)
-{
-if (wksp == NULL)
-return NULL;
-return ZSTD_initStaticCCtx(wksp, wksp_size);
-}
-
-static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, uint64_t pledged_src_size, void* wksp, size_t wksp_size)
-{
-ZSTD_CStream* cstream;
-size_t ret;
-
-if (wksp == NULL)
-return NULL;
-
-cstream = ZSTD_initStaticCStream(wksp, wksp_size);
-if (cstream == NULL)
-return NULL;
-
-/* 0 means unknown in old API but means 0 in new API */
-if (pledged_src_size == 0)
-pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN;
-
-ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size);
-if (ZSTD_isError(ret))
-return NULL;
-
-return cstream;
-}
-#define ZSTD_initCStream ZSTD_initCStream_compat
-
-static inline ZSTD_DCtx* ZSTD_initDCtx(void* wksp, size_t wksp_size)
-{
-if (wksp == NULL)
-return NULL;
-return ZSTD_initStaticDCtx(wksp, wksp_size);
-}
-
-static inline ZSTD_DStream* ZSTD_initDStream_compat(unsigned long long window_size, void* wksp, size_t wksp_size)
-{
-if (wksp == NULL)
-return NULL;
-(void)window_size;
-return ZSTD_initStaticDStream(wksp, wksp_size);
-}
-#define ZSTD_initDStream ZSTD_initDStream_compat
-
-typedef ZSTD_frameHeader ZSTD_frameParams;
-
-static inline size_t ZSTD_getFrameParams(ZSTD_frameParams* frame_params, const void* src, size_t src_size)
-{
-return ZSTD_getFrameHeader(frame_params, src, src_size);
-}
-
-static inline size_t ZSTD_compressCCtx_compat(ZSTD_CCtx* cctx, void* dst, size_t dst_capacity, const void* src, size_t src_size, ZSTD_parameters params)
-{
-return ZSTD_compress_advanced(cctx, dst, dst_capacity, src, src_size, NULL, 0, params);
-}
-#define ZSTD_compressCCtx ZSTD_compressCCtx_compat
-
-#endif /* ZSTD_VERSION_NUMBER >= 10406 */
-#endif /* ZSTD_COMPAT_H */
-- 
2.28.0



[PATCH v3 8/9] lib: unzstd: Switch to the zstd-1.4.6 API

2020-09-23 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 lib/decompress_unzstd.c | 40 ++--
 1 file changed, 14 insertions(+), 26 deletions(-)

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index a79f705f236d..d4685df0e120 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -73,7 +73,8 @@
 
 #include 
 #include 
-#include 
+#include 
+#include 
 
 /* 128MB is the maximum window size supported by zstd. */
 #define ZSTD_WINDOWSIZE_MAX(1 << ZSTD_WINDOWLOG_MAX)
@@ -120,9 +121,9 @@ static int INIT decompress_single(const u8 *in_buf, long in_len, u8 *out_buf,
  long out_len, long *in_pos,
  void (*error)(char *x))
 {
-   const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+   const size_t wksp_size = ZSTD_estimateDCtxSize();
void *wksp = large_malloc(wksp_size);
-   ZSTD_DCtx *dctx = ZSTD_initDCtx(wksp, wksp_size);
+   ZSTD_DCtx *dctx = ZSTD_initStaticDCtx(wksp, wksp_size);
int err;
size_t ret;
 
@@ -165,7 +166,6 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
 {
ZSTD_inBuffer in;
ZSTD_outBuffer out;
-   ZSTD_frameParams params;
void *in_allocated = NULL;
void *out_allocated = NULL;
void *wksp = NULL;
@@ -229,36 +229,24 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
out.size = out_len;
 
/*
-* We need to know the window size to allocate the ZSTD_DStream.
-* Since we are streaming, we need to allocate a buffer for the sliding
-* window. The window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
-* (8 MB), so it is important to use the actual value so as not to
-* waste memory when it is smaller.
+* Zstd determines the workspace size from the window size written
+* into the frame header. This ensures that we use the minimum value
+* possible, since the window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
+* (1 GB), so it is very important to use the actual value.
 */
-   ret = ZSTD_getFrameParams(&params, in.src, in.size);
+   wksp_size = ZSTD_estimateDStreamSize_fromFrame(in.src, in.size);
err = handle_zstd_error(ret, error);
if (err)
goto out;
-   if (ret != 0) {
-   error("ZSTD-compressed data has an incomplete frame header");
-   err = -1;
-   goto out;
-   }
-   if (params.windowSize > ZSTD_WINDOWSIZE_MAX) {
-   error("ZSTD-compressed data has too large a window size");
+   wksp = large_malloc(wksp_size);
+   if (wksp == NULL) {
+   error("Out of memory while allocating ZSTD_DStream");
err = -1;
goto out;
}
-
-   /*
-* Allocate the ZSTD_DStream now that we know how much memory is
-* required.
-*/
-   wksp_size = ZSTD_DStreamWorkspaceBound(params.windowSize);
-   wksp = large_malloc(wksp_size);
-   dstream = ZSTD_initDStream(params.windowSize, wksp, wksp_size);
+   dstream = ZSTD_initStaticDStream(wksp, wksp_size);
if (dstream == NULL) {
-   error("Out of memory while allocating ZSTD_DStream");
+   error("ZSTD_initStaticDStream failed");
err = -1;
goto out;
}
-- 
2.28.0



[PATCH v3 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API

2020-09-23 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 fs/btrfs/zstd.c | 48 
 1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index a7367ff573d4..6b466e090cd7 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include "misc.h"
 #include "compression.h"
 #include "ctree.h"
@@ -159,8 +159,8 @@ static void zstd_calc_ws_mem_sizes(void)
zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT);
size_t level_size =
max_t(size_t,
- ZSTD_CStreamWorkspaceBound(params.cParams),
- ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT));
+ ZSTD_estimateCStreamSize_usingCParams(params.cParams),
+ ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT));
 
max_size = max_t(size_t, max_size, level_size);
zstd_ws_mem_sizes[level - 1] = max_size;
@@ -389,13 +389,23 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
*total_in = 0;
 
/* Initialize the stream */
-   stream = ZSTD_initCStream(params, len, workspace->mem,
-   workspace->size);
+   stream = ZSTD_initStaticCStream(workspace->mem, workspace->size);
if (!stream) {
-   pr_warn("BTRFS: ZSTD_initCStream failed\n");
+   pr_warn("BTRFS: ZSTD_initStaticCStream failed\n");
ret = -EIO;
goto out;
}
+   {
+   size_t ret2;
+
+   ret2 = ZSTD_initCStream_advanced(stream, NULL, 0, params, len);
+   if (ZSTD_isError(ret2)) {
+   pr_warn("BTRFS: ZSTD_initCStream_advanced returned %s\n",
+   ZSTD_getErrorName(ret2));
+   ret = -EIO;
+   goto out;
+   }
+   }
 
/* map in the first page of input data */
in_page = find_get_page(mapping, start >> PAGE_SHIFT);
@@ -421,8 +431,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
ret2 = ZSTD_compressStream(stream, &workspace->out_buf,
&workspace->in_buf);
if (ZSTD_isError(ret2)) {
-   pr_debug("BTRFS: ZSTD_compressStream returned %d\n",
-   ZSTD_getErrorCode(ret2));
+   pr_debug("BTRFS: ZSTD_compressStream returned %s\n",
+   ZSTD_getErrorName(ret2));
ret = -EIO;
goto out;
}
@@ -489,8 +499,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
 
ret2 = ZSTD_endStream(stream, &workspace->out_buf);
if (ZSTD_isError(ret2)) {
-   pr_debug("BTRFS: ZSTD_endStream returned %d\n",
-   ZSTD_getErrorCode(ret2));
+   pr_debug("BTRFS: ZSTD_endStream returned %s\n",
+   ZSTD_getErrorName(ret2));
ret = -EIO;
goto out;
}
@@ -557,10 +567,9 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
unsigned long buf_start;
unsigned long total_out = 0;
 
-   stream = ZSTD_initDStream(
-   ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
+   stream = ZSTD_initStaticDStream(workspace->mem, workspace->size);
if (!stream) {
-   pr_debug("BTRFS: ZSTD_initDStream failed\n");
+   pr_debug("BTRFS: ZSTD_initStaticDStream failed\n");
ret = -EIO;
goto done;
}
@@ -579,8 +588,8 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
ret2 = ZSTD_decompressStream(stream, &workspace->out_buf,
&workspace->in_buf);
if (ZSTD_isError(ret2)) {
-   pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
-   ZSTD_getErrorCode(ret2));
+   pr_debug("BTRFS: ZSTD_decompressStream returned %s\n",
+   ZSTD_getErrorName(ret2));
ret = -EIO;
goto done;
}
@@ -633,10 +642,9 @@ int zstd_decompress(struct list_head *ws, unsigned char *data_in,
unsigned long pg_offset = 0;
char *kaddr;
 
-

[PATCH v3 4/9] crypto: zstd: Switch to zstd-1.4.6 API

2020-09-23 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 crypto/zstd.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/crypto/zstd.c b/crypto/zstd.c
index dcda3cad3b5c..767fe2fbe009 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 
@@ -24,16 +24,15 @@ struct zstd_ctx {
void *dwksp;
 };
 
-static ZSTD_parameters zstd_params(void)
-{
-   return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
-}
-
 static int zstd_comp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const ZSTD_parameters params = zstd_params();
-   const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+   const size_t wksp_size = ZSTD_estimateCCtxSize(ZSTD_DEF_LEVEL);
+
+   if (ZSTD_isError(wksp_size)) {
+   ret = -EINVAL;
+   goto out_free;
+   }
 
ctx->cwksp = vzalloc(wksp_size);
if (!ctx->cwksp) {
@@ -41,7 +40,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+   ctx->cctx = ZSTD_initStaticCCtx(ctx->cwksp, wksp_size);
if (!ctx->cctx) {
ret = -EINVAL;
goto out_free;
@@ -56,7 +55,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 static int zstd_decomp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+   const size_t wksp_size = ZSTD_estimateDCtxSize();
 
ctx->dwksp = vzalloc(wksp_size);
if (!ctx->dwksp) {
@@ -64,7 +63,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+   ctx->dctx = ZSTD_initStaticDCtx(ctx->dwksp, wksp_size);
if (!ctx->dctx) {
ret = -EINVAL;
goto out_free;
@@ -152,9 +151,8 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
 {
size_t out_len;
struct zstd_ctx *zctx = ctx;
-   const ZSTD_parameters params = zstd_params();
 
-   out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
+   out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, ZSTD_DEF_LEVEL);
if (ZSTD_isError(out_len))
return -EINVAL;
*dlen = out_len;
-- 
2.28.0



[PATCH v3 2/9] lib: zstd: Add decompress_sources.h for decompress_unzstd

2020-09-23 Thread Nick Terrell
From: Nick Terrell 

Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.

This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.

Signed-off-by: Nick Terrell 
---
 lib/decompress_unzstd.c   |  6 +-
 lib/zstd/decompress_sources.h | 14 ++
 2 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 lib/zstd/decompress_sources.h

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index dbc290af26b4..a79f705f236d 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
 #ifdef STATIC
 # define UNZSTD_PREBOOT
 # include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
 #endif
 
 #include 
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index ..ccb4960ea0cd
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
-- 
2.28.0



[PATCH v3 1/9] lib: zstd: Add zstd compatibility wrapper

2020-09-23 Thread Nick Terrell
From: Nick Terrell 

Adds zstd_compat.h which provides the necessary functions from the
current zstd.h API. It is only active for zstd versions 1.4.6 and newer.
That means it is disabled currently, but will become active when a later
patch in this series updates the zstd library in the kernel to 1.4.6.

This header allows the zstd upgrade to 1.4.6 without changing any
callers, since they all include zstd through the compatibility wrapper.
Later patches in this series transition each caller away from the
compatibility wrapper. After all the callers have been transitioned away
from the compatibility wrapper, the final patch in this series deletes
it.

Signed-off-by: Nick Terrell 
---
 crypto/zstd.c   |   2 +-
 fs/btrfs/zstd.c |   2 +-
 fs/f2fs/compress.c  |   2 +-
 fs/squashfs/zstd_wrapper.c  |   2 +-
 include/linux/zstd_compat.h | 116 
 lib/decompress_unzstd.c |   2 +-
 6 files changed, 121 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/zstd_compat.h

diff --git a/crypto/zstd.c b/crypto/zstd.c
index 1a3309f066f7..dcda3cad3b5c 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 
diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index 9a4871636c6c..a7367ff573d4 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include "misc.h"
 #include "compression.h"
 #include "ctree.h"
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 1dfb126a0cb2..e056f3a2b404 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #include "f2fs.h"
 #include "node.h"
diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
index b7cb1faa652d..f8c512a6204e 100644
--- a/fs/squashfs/zstd_wrapper.c
+++ b/fs/squashfs/zstd_wrapper.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include "squashfs_fs.h"
diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h
new file mode 100644
index ..cda9208bf04a
--- /dev/null
+++ b/include/linux/zstd_compat.h
@@ -0,0 +1,116 @@
+/*
+ * Copyright (c) 2016-present, Facebook, Inc.
+ * All rights reserved.
+ *
+ * This source code is licensed under the BSD-style license found in the
+ * LICENSE file in the root directory of https://github.com/facebook/zstd.
+ * An additional grant of patent rights can be found in the PATENTS file in the
+ * same directory.
+ *
+ * This program is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU General Public License version 2 as published by the
+ * Free Software Foundation. This program is dual-licensed; you may select
+ * either version 2 of the GNU General Public License ("GPL") or BSD license
+ * ("BSD").
+ */
+
+#ifndef ZSTD_COMPAT_H
+#define ZSTD_COMPAT_H
+
+#include 
+
+#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406)
+/*
+ * This header provides backwards compatibility for the zstd-1.4.6 library
+ * upgrade. This header allows us to upgrade the zstd library version without
+ * modifying any callers. Then we will migrate callers from the compatibility
+ * wrapper one at a time until none remain. At which point we will delete this
+ * header.
+ *
+ * It is temporary and will be deleted once the upgrade is complete.
+ */
+
+#include 
+
+static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params)
+{
+return ZSTD_estimateCCtxSize_usingCParams(compression_params);
+}
+
+static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params)
+{
+return ZSTD_estimateCStreamSize_usingCParams(compression_params);
+}
+
+static inline size_t ZSTD_DCtxWorkspaceBound(void)
+{
+return ZSTD_estimateDCtxSize();
+}
+
+static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size)
+{
+return ZSTD_estimateDStreamSize(window_size);
+}
+
+static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size)
+{
+if (wksp == NULL)
+return NULL;
+return ZSTD_initStaticCCtx(wksp, wksp_size);
+}
+
+static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, uint64_t pledged_src_size, void* wksp, size_t wksp_size)
+{
+ZSTD_CStream* cstream;
+size_t ret;
+
+if (wksp == NULL)
+return NULL;
+
+cstream = ZSTD_initStaticCStream(wksp, wksp_size);
+if (cstream == NULL)
+return NULL;
+
+/* 0 means unknown in old API but means 0 in new API */
+if (pledged_src_size == 0)
+pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN;
+
+ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size);
+if (ZSTD_isError(ret))
+return NULL;
+
+return cstream;
+}
+#define ZS

[PATCH v3 0/9] Update to zstd-1.4.6

2020-09-23 Thread Nick Terrell
From: Nick Terrell 

This patchset upgrades the zstd library to the latest upstream release. The
current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.6 upstream. I want the zstd version in the
kernel to match up with a known upstream release, so we know exactly what code
is running. Whenever this patchset is ready for merge, I will cut a release at
the upstream commit that gets merged. This should not be necessary for future
releases.

The kernel zstd library is automatically generated from upstream zstd. A script
makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested every commit by upstream's continuous integration.
When we cut a new zstd release, we will submit a patch to the kernel to update
the zstd version in the kernel.
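
As an illustration of change 1 in the list above, here is a minimal sketch of
one way a library can route its libc dependencies through a single layer, so
that an import step only needs to swap one set of definitions. The SKETCH_*
macros and sketch_copy_literals() are hypothetical names used for illustration
only; they are not taken from zstd or from the import script.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical indirection layer. Library code calls these macros and never
 * names libc directly, so a different build could redefine them in one place.
 */
#define SKETCH_MEMCPY(dst, src, n) memcpy((dst), (src), (n))
#define SKETCH_MEMSET(dst, val, n) memset((dst), (val), (n))

/*
 * Copy up to dst_cap bytes from src into dst and zero-fill whatever is left
 * of dst. Returns the number of bytes copied.
 */
static size_t sketch_copy_literals(unsigned char *dst, size_t dst_cap,
                                   const unsigned char *src, size_t src_len)
{
	size_t n = src_len < dst_cap ? src_len : dst_cap;

	SKETCH_MEMCPY(dst, src, n);
	SKETCH_MEMSET(dst + n, 0, dst_cap - n);
	return n;
}
```

With this layout, only the two macro definitions mention libc, so the rest of
the code stays identical between builds.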

I've updated zstd to upstream with one big patch because every commit must
build, so that precludes partial updates. Since the commit is 100% generated, I
hope the review burden is lightened. I considered replaying upstream commits,
but that is not possible because there have been ~3500 upstream commits since
the last zstd import, and the commits don't all build individually. The bulk
update preserves bisectability because bugs can be bisected to the zstd version
update. At that point the update can be reverted, and we can work with upstream
to find and fix the bug. After this big switch in how the kernel consumes zstd,
future patches will be smaller, because they will only have one upstream
release worth of changes each.

This patchset changes the zstd API from a custom kernel API to the upstream API.
I considered wrapping the upstream API with a wrapper that is closer to the
kernel style guide. Following advice from https://lkml.org/lkml/2020/9/17/814,
I've chosen to use the upstream API directly, to minimize opportunities to
introduce bugs, and because using the upstream API directly makes debugging and
communication with upstream easier.

This patchset comes in 3 parts:
1. The first 2 patches prepare for the zstd upgrade. The first patch adds a
   compatibility wrapper so zstd can be upgraded without modifying any callers.
   The second patch adds an indirection for how lib/decompress_unzstd.c
   includes all of the decompression source files.
2. Import zstd-1.4.6. This patch is completely generated from upstream using
   automated tooling.
3. Update all callers to the zstd-1.4.6 API then delete the compatibility
   wrapper.

I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade
using the compatibility wrapper, and after the final patch in this series. 

I tested kernel and initramfs decompression on i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
I found:
* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during
compression.

Please let me know if there is anything that I can do to ease the way for these
patches. I think it is important because it gets large performance improvements,
contains bug fixes, and is switching to a more maintainable model of consuming
upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix the ZSTD_initCStream() wrapper to treat pledged_src_size=0 as
  unknown. This fixes F2FS with the zstd-1.4.6 compatibility wrapper, a bug
  exposed by the test.
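
The pledged_src_size fix above can be sketched as a small translation step. In
the old kernel API a pledged size of 0 meant "unknown", while in the new API 0
means an empty input and a sentinel value means "unknown". The sketch below is
standalone: SKETCH_CONTENTSIZE_UNKNOWN is a hypothetical stand-in for
ZSTD_CONTENTSIZE_UNKNOWN (which upstream defines as 0ULL - 1), and the function
name is made up for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for upstream's ZSTD_CONTENTSIZE_UNKNOWN sentinel. */
#define SKETCH_CONTENTSIZE_UNKNOWN (0ULL - 1)

/*
 * Translate an old-API pledged source size, where 0 means "unknown", into
 * the new-API convention, where "unknown" is an explicit sentinel.
 */
static unsigned long long sketch_pledged_size_for_new_api(uint64_t old_size)
{
	if (old_size == 0)
		return SKETCH_CONTENTSIZE_UNKNOWN;
	return old_size;
}
```

A caller that knows the input size passes it through unchanged; only the
legacy "0 means unknown" case is remapped.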

v2 -> v3:
* (3/9) Silence warnings by Kernel Test Robot:
  https://github.com/facebook/zstd/pull/2324
  Stack size warnings remain, but these aren't new, and the functions it warns
  on are either unused or not in the maximum stack path. This patchset reduces
  zstd compression stack usage by 1 KB overall. I've gotten the low hanging
  fruit, and more stack reduction would require significant changes that have
  the potential to introduce new bugs. However, I do hope to continue to reduce
  zstd stack usage in future versions.

Nick Terrell (9):
  lib: zstd: Add zstd compatibility wrapper
  lib: zstd: Add decompress_sources.h for decompress_unzstd
  lib: zstd: Upgrade to latest upstream zstd version

[PATCH v2 4/9] crypto: zstd: Switch to zstd-1.4.6 API

2020-09-22 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 crypto/zstd.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/crypto/zstd.c b/crypto/zstd.c
index dcda3cad3b5c..767fe2fbe009 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 
@@ -24,16 +24,15 @@ struct zstd_ctx {
void *dwksp;
 };
 
-static ZSTD_parameters zstd_params(void)
-{
-   return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
-}
-
 static int zstd_comp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const ZSTD_parameters params = zstd_params();
-   const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+   const size_t wksp_size = ZSTD_estimateCCtxSize(ZSTD_DEF_LEVEL);
+
+   if (ZSTD_isError(wksp_size)) {
+   ret = -EINVAL;
+   goto out_free;
+   }
 
ctx->cwksp = vzalloc(wksp_size);
if (!ctx->cwksp) {
@@ -41,7 +40,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+   ctx->cctx = ZSTD_initStaticCCtx(ctx->cwksp, wksp_size);
if (!ctx->cctx) {
ret = -EINVAL;
goto out_free;
@@ -56,7 +55,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 static int zstd_decomp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+   const size_t wksp_size = ZSTD_estimateDCtxSize();
 
ctx->dwksp = vzalloc(wksp_size);
if (!ctx->dwksp) {
@@ -64,7 +63,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+   ctx->dctx = ZSTD_initStaticDCtx(ctx->dwksp, wksp_size);
if (!ctx->dctx) {
ret = -EINVAL;
goto out_free;
@@ -152,9 +151,8 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
 {
size_t out_len;
struct zstd_ctx *zctx = ctx;
-   const ZSTD_parameters params = zstd_params();
 
-   out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
+   out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, 
ZSTD_DEF_LEVEL);
if (ZSTD_isError(out_len))
return -EINVAL;
*dlen = out_len;
-- 
2.28.0



[PATCH v2 7/9] squashfs: zstd: Switch to the zstd-1.4.6 API

2020-09-22 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 fs/squashfs/zstd_wrapper.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
index f8c512a6204e..add582409866 100644
--- a/fs/squashfs/zstd_wrapper.c
+++ b/fs/squashfs/zstd_wrapper.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include "squashfs_fs.h"
@@ -34,7 +34,7 @@ static void *zstd_init(struct squashfs_sb_info *msblk, void 
*buff)
goto failed;
wksp->window_size = max_t(size_t,
msblk->block_size, SQUASHFS_METADATA_SIZE);
-   wksp->mem_size = ZSTD_DStreamWorkspaceBound(wksp->window_size);
+   wksp->mem_size = ZSTD_estimateDStreamSize(wksp->window_size);
wksp->mem = vmalloc(wksp->mem_size);
if (wksp->mem == NULL)
goto failed;
@@ -71,7 +71,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, 
void *strm,
struct bvec_iter_all iter_all = {};
struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
 
-   stream = ZSTD_initDStream(wksp->window_size, wksp->mem, wksp->mem_size);
+   stream = ZSTD_initStaticDStream(wksp->mem, wksp->mem_size);
 
if (!stream) {
ERROR("Failed to initialize zstd decompressor\n");
@@ -122,8 +122,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, 
void *strm,
break;
 
if (ZSTD_isError(zstd_err)) {
-   ERROR("zstd decompression error: %d\n",
-   (int)ZSTD_getErrorCode(zstd_err));
+   ERROR("zstd decompression error: %s\n", 
ZSTD_getErrorName(zstd_err));
error = -EIO;
break;
}
-- 
2.28.0



[PATCH v2 9/9] lib: zstd: Remove zstd compatibility wrapper

2020-09-22 Thread Nick Terrell
From: Nick Terrell 

All callers have been transitioned to the new zstd-1.4.6 API. There are
no more callers of the zstd compatibility wrapper, so delete it.

Signed-off-by: Nick Terrell 
---
 include/linux/zstd_compat.h | 116 
 1 file changed, 116 deletions(-)
 delete mode 100644 include/linux/zstd_compat.h

diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h
deleted file mode 100644
index cda9208bf04a..
--- a/include/linux/zstd_compat.h
+++ /dev/null
@@ -1,116 +0,0 @@
-/*
- * Copyright (c) 2016-present, Facebook, Inc.
- * All rights reserved.
- *
- * This source code is licensed under the BSD-style license found in the
- * LICENSE file in the root directory of https://github.com/facebook/zstd.
- * An additional grant of patent rights can be found in the PATENTS file in the
- * same directory.
- *
- * This program is free software; you can redistribute it and/or modify it 
under
- * the terms of the GNU General Public License version 2 as published by the
- * Free Software Foundation. This program is dual-licensed; you may select
- * either version 2 of the GNU General Public License ("GPL") or BSD license
- * ("BSD").
- */
-
-#ifndef ZSTD_COMPAT_H
-#define ZSTD_COMPAT_H
-
-#include 
-
-#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406)
-/*
- * This header provides backwards compatibility for the zstd-1.4.6 library
- * upgrade. This header allows us to upgrade the zstd library version without
- * modifying any callers. Then we will migrate callers from the compatibility
- * wrapper one at a time until none remain. At which point we will delete this
- * header.
- *
- * It is temporary and will be deleted once the upgrade is complete.
- */
-
-#include 
-
-static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters 
compression_params)
-{
-return ZSTD_estimateCCtxSize_usingCParams(compression_params);
-}
-
-static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters 
compression_params)
-{
-return ZSTD_estimateCStreamSize_usingCParams(compression_params);
-}
-
-static inline size_t ZSTD_DCtxWorkspaceBound(void)
-{
-return ZSTD_estimateDCtxSize();
-}
-
-static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size)
-{
-return ZSTD_estimateDStreamSize(window_size);
-}
-
-static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size)
-{
-if (wksp == NULL)
-return NULL;
-return ZSTD_initStaticCCtx(wksp, wksp_size);
-}
-
-static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, 
uint64_t pledged_src_size, void* wksp, size_t wksp_size)
-{
-ZSTD_CStream* cstream;
-size_t ret;
-
-if (wksp == NULL)
-return NULL;
-
-cstream = ZSTD_initStaticCStream(wksp, wksp_size);
-if (cstream == NULL)
-return NULL;
-
-/* 0 means unknown in old API but means 0 in new API */
-if (pledged_src_size == 0)
-pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN;
-
-ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, 
pledged_src_size);
-if (ZSTD_isError(ret))
-return NULL;
-
-return cstream;
-}
-#define ZSTD_initCStream ZSTD_initCStream_compat
-
-static inline ZSTD_DCtx* ZSTD_initDCtx(void* wksp, size_t wksp_size)
-{
-if (wksp == NULL)
-return NULL;
-return ZSTD_initStaticDCtx(wksp, wksp_size);
-}
-
-static inline ZSTD_DStream* ZSTD_initDStream_compat(unsigned long long 
window_size, void* wksp, size_t wksp_size)
-{
-if (wksp == NULL)
-return NULL;
-(void)window_size;
-return ZSTD_initStaticDStream(wksp, wksp_size);
-}
-#define ZSTD_initDStream ZSTD_initDStream_compat
-
-typedef ZSTD_frameHeader ZSTD_frameParams;
-
-static inline size_t ZSTD_getFrameParams(ZSTD_frameParams* frame_params, const 
void* src, size_t src_size)
-{
-return ZSTD_getFrameHeader(frame_params, src, src_size);
-}
-
-static inline size_t ZSTD_compressCCtx_compat(ZSTD_CCtx* cctx, void* dst, 
size_t dst_capacity, const void* src, size_t src_size, ZSTD_parameters params)
-{
-return ZSTD_compress_advanced(cctx, dst, dst_capacity, src, src_size, 
NULL, 0, params);
-}
-#define ZSTD_compressCCtx ZSTD_compressCCtx_compat
-
-#endif /* ZSTD_VERSION_NUMBER >= 10406 */
-#endif /* ZSTD_COMPAT_H */
-- 
2.28.0



[PATCH v2 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API

2020-09-22 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is more efficient because it uses the single-pass API instead of
the streaming API. The streaming API is not necessary because the whole
input and output buffers are available. This saves memory because we
don't need to allocate a buffer for the window. It is also more
efficient because it saves unnecessary memcpy calls.

Compression memory increases from 168 KB to 204 KB because upstream
uses slightly more memory. Decompression memory decreases from 1.4 MB
to 158 KB.
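
As a rough illustration of the error handling that the single-pass path in the
diff below uses, here is a self-contained sketch. The enum and
demo_map_compress_error() are hypothetical names, not part of zstd or F2FS;
only the mapping itself (destination too small gives -EAGAIN, any other
failure gives -EIO) is taken from the patch.

```c
#include <errno.h>

/* Hypothetical error codes; the real ones come from the zstd errors header. */
enum demo_zstd_error {
	DEMO_ERROR_NONE,
	DEMO_ERROR_DST_SIZE_TOO_SMALL,
	DEMO_ERROR_GENERIC,
};

/* Maps a single-pass compression result to an errno-style return value. */
static int demo_map_compress_error(enum demo_zstd_error code)
{
	switch (code) {
	case DEMO_ERROR_NONE:
		return 0;
	case DEMO_ERROR_DST_SIZE_TOO_SMALL:
		/* The compressed output did not fit in the destination buffer. */
		return -EAGAIN;
	default:
		/* Every other compression error is reported as an I/O error. */
		return -EIO;
	}
}
```

With the streaming API the "did not fit" case showed up as leftover data after
ending the stream; with the single-pass API it shows up as an error code, so
the check moves inside the error branch.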

Signed-off-by: Nick Terrell 
---
 fs/f2fs/compress.c | 102 +
 1 file changed, 38 insertions(+), 64 deletions(-)

diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index e056f3a2b404..b79efce81651 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -11,7 +11,8 @@
 #include 
 #include 
 #include 
-#include 
+#include 
+#include 
 
 #include "f2fs.h"
 #include "node.h"
@@ -298,21 +299,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = {
 static int zstd_init_compress_ctx(struct compress_ctx *cc)
 {
ZSTD_parameters params;
-   ZSTD_CStream *stream;
+   ZSTD_CCtx *ctx;
void *workspace;
unsigned int workspace_size;
 
params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0);
-   workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams);
+   workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams);
 
workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode),
workspace_size, GFP_NOFS);
if (!workspace)
return -ENOMEM;
 
-   stream = ZSTD_initCStream(params, 0, workspace, workspace_size);
-   if (!stream) {
-   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initCStream 
failed\n",
+   ctx = ZSTD_initStaticCCtx(workspace, workspace_size);
+   if (!ctx) {
+   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_inittaticCStream 
failed\n",
KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
__func__);
kvfree(workspace);
@@ -320,7 +321,7 @@ static int zstd_init_compress_ctx(struct compress_ctx *cc)
}
 
cc->private = workspace;
-   cc->private2 = stream;
+   cc->private2 = ctx;
 
cc->clen = cc->rlen - PAGE_SIZE - COMPRESS_HEADER_SIZE;
return 0;
@@ -335,65 +336,48 @@ static void zstd_destroy_compress_ctx(struct compress_ctx 
*cc)
 
 static int zstd_compress_pages(struct compress_ctx *cc)
 {
-   ZSTD_CStream *stream = cc->private2;
-   ZSTD_inBuffer inbuf;
-   ZSTD_outBuffer outbuf;
-   int src_size = cc->rlen;
-   int dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
-   int ret;
-
-   inbuf.pos = 0;
-   inbuf.src = cc->rbuf;
-   inbuf.size = src_size;
-
-   outbuf.pos = 0;
-   outbuf.dst = cc->cbuf->cdata;
-   outbuf.size = dst_size;
-
-   ret = ZSTD_compressStream(stream, &outbuf, &inbuf);
-   if (ZSTD_isError(ret)) {
-   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compressStream 
failed, ret: %d\n",
-   KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
-   __func__, ZSTD_getErrorCode(ret));
-   return -EIO;
-   }
-
-   ret = ZSTD_endStream(stream, &outbuf);
+   ZSTD_CCtx *ctx = cc->private2;
+   const size_t src_size = cc->rlen;
+   const size_t dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
+   ZSTD_parameters params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, 
src_size, 0);
+   size_t ret;
+
+   ret = ZSTD_compress_advanced(
+   ctx, cc->cbuf->cdata, dst_size, cc->rbuf, src_size, 
NULL, 0, params);
if (ZSTD_isError(ret)) {
-   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_endStream returned 
%d\n",
+   /*
+* there is compressed data remained in intermediate buffer due 
to
+* no more space in cbuf.cdata
+*/
+   if (ZSTD_getErrorCode(ret) == ZSTD_error_dstSize_tooSmall)
+   return -EAGAIN;
+   /* other compression errors return -EIO */
+   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compress_advanced 
failed, err: %s\n",
KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
-   __func__, ZSTD_getErrorCode(ret));
+   __func__, ZSTD_getErrorName(ret));
return -EIO;
}
 
-   /*
-* there is compressed data remained in intermediate buffer due to
-* no more space in cbuf.cdata
-*/
-   if (ret)
-   return -EAGAIN;
-
-   cc->clen = outbuf.

[PATCH v2 8/9] lib: unzstd: Switch to the zstd-1.4.6 API

2020-09-22 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 lib/decompress_unzstd.c | 40 ++--
 1 file changed, 14 insertions(+), 26 deletions(-)

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index a79f705f236d..d4685df0e120 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -73,7 +73,8 @@
 
 #include 
 #include 
-#include 
+#include 
+#include 
 
 /* 128MB is the maximum window size supported by zstd. */
 #define ZSTD_WINDOWSIZE_MAX (1 << ZSTD_WINDOWLOG_MAX)
@@ -120,9 +121,9 @@ static int INIT decompress_single(const u8 *in_buf, long 
in_len, u8 *out_buf,
  long out_len, long *in_pos,
  void (*error)(char *x))
 {
-   const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+   const size_t wksp_size = ZSTD_estimateDCtxSize();
void *wksp = large_malloc(wksp_size);
-   ZSTD_DCtx *dctx = ZSTD_initDCtx(wksp, wksp_size);
+   ZSTD_DCtx *dctx = ZSTD_initStaticDCtx(wksp, wksp_size);
int err;
size_t ret;
 
@@ -165,7 +166,6 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
 {
ZSTD_inBuffer in;
ZSTD_outBuffer out;
-   ZSTD_frameParams params;
void *in_allocated = NULL;
void *out_allocated = NULL;
void *wksp = NULL;
@@ -229,36 +229,24 @@ static int INIT __unzstd(unsigned char *in_buf, long 
in_len,
out.size = out_len;
 
/*
-* We need to know the window size to allocate the ZSTD_DStream.
-* Since we are streaming, we need to allocate a buffer for the sliding
-* window. The window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
-* (8 MB), so it is important to use the actual value so as not to
-* waste memory when it is smaller.
+* Zstd determines the workspace size from the window size written
+* into the frame header. This ensures that we use the minimum value
+* possible, since the window size varies from 1 KB to 
ZSTD_WINDOWSIZE_MAX
+* (1 GB), so it is very important to use the actual value.
 */
-   ret = ZSTD_getFrameParams(&params, in.src, in.size);
+   wksp_size = ZSTD_estimateDStreamSize_fromFrame(in.src, in.size);
err = handle_zstd_error(ret, error);
if (err)
goto out;
-   if (ret != 0) {
-   error("ZSTD-compressed data has an incomplete frame header");
-   err = -1;
-   goto out;
-   }
-   if (params.windowSize > ZSTD_WINDOWSIZE_MAX) {
-   error("ZSTD-compressed data has too large a window size");
+   wksp = large_malloc(wksp_size);
+   if (wksp == NULL) {
+   error("Out of memory while allocating ZSTD_DStream");
err = -1;
goto out;
}
-
-   /*
-* Allocate the ZSTD_DStream now that we know how much memory is
-* required.
-*/
-   wksp_size = ZSTD_DStreamWorkspaceBound(params.windowSize);
-   wksp = large_malloc(wksp_size);
-   dstream = ZSTD_initDStream(params.windowSize, wksp, wksp_size);
+   dstream = ZSTD_initStaticDStream(wksp, wksp_size);
if (dstream == NULL) {
-   error("Out of memory while allocating ZSTD_DStream");
+   error("ZSTD_initStaticDStream failed");
err = -1;
goto out;
}
-- 
2.28.0



[PATCH v2 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API

2020-09-22 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 fs/btrfs/zstd.c | 48 
 1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index a7367ff573d4..6b466e090cd7 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include "misc.h"
 #include "compression.h"
 #include "ctree.h"
@@ -159,8 +159,8 @@ static void zstd_calc_ws_mem_sizes(void)
zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT);
size_t level_size =
max_t(size_t,
- ZSTD_CStreamWorkspaceBound(params.cParams),
- ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT));
+ 
ZSTD_estimateCStreamSize_usingCParams(params.cParams),
+ ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT));
 
max_size = max_t(size_t, max_size, level_size);
zstd_ws_mem_sizes[level - 1] = max_size;
@@ -389,13 +389,23 @@ int zstd_compress_pages(struct list_head *ws, struct 
address_space *mapping,
*total_in = 0;
 
/* Initialize the stream */
-   stream = ZSTD_initCStream(params, len, workspace->mem,
-   workspace->size);
+   stream = ZSTD_initStaticCStream(workspace->mem, workspace->size);
if (!stream) {
-   pr_warn("BTRFS: ZSTD_initCStream failed\n");
+   pr_warn("BTRFS: ZSTD_initStaticCStream failed\n");
ret = -EIO;
goto out;
}
+   {
+   size_t ret2;
+
+   ret2 = ZSTD_initCStream_advanced(stream, NULL, 0, params, len);
+   if (ZSTD_isError(ret2)) {
+   pr_warn("BTRFS: ZSTD_initCStream_advanced returned 
%s\n",
+   ZSTD_getErrorName(ret2));
+   ret = -EIO;
+   goto out;
+   }
+   }
 
/* map in the first page of input data */
in_page = find_get_page(mapping, start >> PAGE_SHIFT);
@@ -421,8 +431,8 @@ int zstd_compress_pages(struct list_head *ws, struct 
address_space *mapping,
ret2 = ZSTD_compressStream(stream, &workspace->out_buf,
&workspace->in_buf);
if (ZSTD_isError(ret2)) {
-   pr_debug("BTRFS: ZSTD_compressStream returned %d\n",
-   ZSTD_getErrorCode(ret2));
+   pr_debug("BTRFS: ZSTD_compressStream returned %s\n",
+   ZSTD_getErrorName(ret2));
ret = -EIO;
goto out;
}
@@ -489,8 +499,8 @@ int zstd_compress_pages(struct list_head *ws, struct 
address_space *mapping,
 
ret2 = ZSTD_endStream(stream, &workspace->out_buf);
if (ZSTD_isError(ret2)) {
-   pr_debug("BTRFS: ZSTD_endStream returned %d\n",
-   ZSTD_getErrorCode(ret2));
+   pr_debug("BTRFS: ZSTD_endStream returned %s\n",
+   ZSTD_getErrorName(ret2));
ret = -EIO;
goto out;
}
@@ -557,10 +567,9 @@ int zstd_decompress_bio(struct list_head *ws, struct 
compressed_bio *cb)
unsigned long buf_start;
unsigned long total_out = 0;
 
-   stream = ZSTD_initDStream(
-   ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
+   stream = ZSTD_initStaticDStream(workspace->mem, workspace->size);
if (!stream) {
-   pr_debug("BTRFS: ZSTD_initDStream failed\n");
+   pr_debug("BTRFS: ZSTD_initStaticDStream failed\n");
ret = -EIO;
goto done;
}
@@ -579,8 +588,8 @@ int zstd_decompress_bio(struct list_head *ws, struct 
compressed_bio *cb)
ret2 = ZSTD_decompressStream(stream, &workspace->out_buf,
&workspace->in_buf);
if (ZSTD_isError(ret2)) {
-   pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
-   ZSTD_getErrorCode(ret2));
+   pr_debug("BTRFS: ZSTD_decompressStream returned %s\n",
+   ZSTD_getErrorName(ret2));
ret = -EIO;
goto done;
}
@@ -633,10 +642,9 @@ int zstd_decompress(struct list_head *ws, unsigned char 
*data_in,
unsigned long pg_offset = 0;
char *kaddr;
 
-

[PATCH v2 0/9] Update to zstd-1.4.6

2020-09-22 Thread Nick Terrell
From: Nick Terrell 

This patchset upgrades the zstd library to the latest upstream release. The
current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.6 upstream. I want the zstd version in the
kernel to match up with a known upstream release, so we know exactly what code
is running. Whenever this patchset is ready for merge, I will cut a release at
the upstream commit that gets merged. This should not be necessary for future
releases.

The kernel zstd library is automatically generated from upstream zstd. A script
makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested every commit by upstream's continuous integration.
When we cut a new zstd release, we will submit a patch to the kernel to update
the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must
build, so that precludes partial updates. Since the commit is 100% generated,
I hope the review burden is lightened. I considered replaying upstream commits,
but that is not possible because there have been ~3500 upstream commits since
the last zstd import, and the commits don't all build individually. The bulk
update preserves bisectability because bugs can be bisected to the zstd version
update. At that point the update can be reverted, and we can work with upstream
to find and fix the bug. After this big switch in how the kernel consumes zstd,
future patches will be smaller, because they will only have one upstream
release worth of changes each.

This patchset comes in 3 parts:
1. The first 2 patches prepare for the zstd upgrade. The first patch adds a
   compatibility wrapper so zstd can be upgraded without modifying any callers.
   The second patch adds an indirection through which lib/decompress_unzstd.c
   includes all decompression source files.
2. Import zstd-1.4.6. This patch is completely generated from upstream using
   automated tooling.
3. Update all callers to the zstd-1.4.6 API then delete the compatibility
   wrapper.

I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade
using the compatibility wrapper, and after the final patch in this series. 

I tested kernel and initramfs decompression in i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
I found:
* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during
compression.

Please let me know if there is anything that I can do to ease the way for these
patches. I think it is important because it gets large performance improvements,
contains bug fixes, and is switching to a more maintainable model of consuming
upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 means
  unknown. This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed
  by the test.
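
To make the pledged_src_size fix above concrete, here is a small sketch of the
translation the wrapper has to perform. The names
demo_translate_pledged_size() and DEMO_CONTENTSIZE_UNKNOWN are hypothetical;
the all-ones sentinel is an assumption that mirrors how upstream represents
"unknown", and is not copied from the kernel tree.

```c
#include <stdint.h>

/* Assumption: stands in for upstream's "content size unknown" sentinel. */
#define DEMO_CONTENTSIZE_UNKNOWN UINT64_MAX

/*
 * The old API treats 0 as "size unknown"; the new API treats 0 as a real
 * promise that the input is empty. Translate before calling the new API.
 */
static uint64_t demo_translate_pledged_size(uint64_t old_api_size)
{
	if (old_api_size == 0)
		return DEMO_CONTENTSIZE_UNKNOWN;
	return old_api_size;
}
```

Without this translation, a caller that passes 0 to mean "unknown" would be
pledging an empty input, which would not match the data it then compresses.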

Nick Terrell (9):
  lib: zstd: Add zstd compatibility wrapper
  lib: zstd: Add decompress_sources.h for decompress_unzstd
  lib: zstd: Upgrade to latest upstream zstd version 1.4.6
  crypto: zstd: Switch to zstd-1.4.6 API
  btrfs: zstd: Switch to the zstd-1.4.6 API
  f2fs: zstd: Switch to the zstd-1.4.6 API
  squashfs: zstd: Switch to the zstd-1.4.6 API
  lib: unzstd: Switch to the zstd-1.4.6 API
  lib: zstd: Remove zstd compatibility wrapper

 crypto/zstd.c |   22 +-
 fs/btrfs/zstd.c   |   46 +-
 fs/f2fs/compress.c|  100 +-
 fs/squashfs/zstd_wrapper.c|7 +-
 include/linux/zstd.h  | 3019 
 include/linux/zstd_errors.h   |   76 +
 lib/decompress_unzstd.c   |   44 +-
 lib/zstd/Makefile |   35 +-
 lib/zstd/bitstream.h  |  379 --
 lib/zstd/common/bitstream.h   |  437 ++
 lib/zstd/common/compiler.h|  134 +
 lib/zstd/common/cpu.h |  194 +
 lib/z

[PATCH v2 1/9] lib: zstd: Add zstd compatibility wrapper

2020-09-22 Thread Nick Terrell
From: Nick Terrell 

Adds zstd_compat.h which provides the necessary functions from the
current zstd.h API. It is only active for zstd versions 1.4.6 and newer.
That means it is disabled currently, but will become active when a later
patch in this series updates the zstd library in the kernel to 1.4.6.
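
As a sketch of how this version gate behaves, the snippet below reproduces the
arithmetic behind the 10406 threshold. All DEMO_* names are hypothetical; the
encoding (major * 10000 + minor * 100 + release) is stated here as an
assumption about how upstream builds its version number, and the "current"
version is hard-coded to 1.3.1 purely for illustration.

```c
/* Assumption: versions are encoded as major*10000 + minor*100 + release. */
#define DEMO_VERSION_NUMBER(major, minor, release) \
	((major) * 100 * 100 + (minor) * 100 + (release))

/* Pretend the library in the tree is still the old 1.3.1 import. */
#define DEMO_LIB_VERSION DEMO_VERSION_NUMBER(1, 3, 1)

/* The compatibility layer only switches on once the library is >= 1.4.6. */
#if DEMO_LIB_VERSION >= DEMO_VERSION_NUMBER(1, 4, 6)
#define DEMO_COMPAT_ACTIVE 1
#else
#define DEMO_COMPAT_ACTIVE 0
#endif

/* Runtime view of the compile-time decision, for easy checking. */
static int demo_compat_active(void)
{
	return DEMO_COMPAT_ACTIVE;
}
```

Under these assumptions 1.4.6 encodes to 10406 and 1.3.1 to 10301, so the gate
stays closed until the library itself is upgraded.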

This header allows the zstd upgrade to 1.4.6 without changing any
callers, since they all include zstd through the compatibility wrapper.
Later patches in this series transition each caller away from the
compatibility wrapper. After all the callers have been transitioned away
from the compatibility wrapper, the final patch in this series deletes
it.

Signed-off-by: Nick Terrell 
---
 crypto/zstd.c   |   2 +-
 fs/btrfs/zstd.c |   2 +-
 fs/f2fs/compress.c  |   2 +-
 fs/squashfs/zstd_wrapper.c  |   2 +-
 include/linux/zstd_compat.h | 116 
 lib/decompress_unzstd.c |   2 +-
 6 files changed, 121 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/zstd_compat.h

diff --git a/crypto/zstd.c b/crypto/zstd.c
index 1a3309f066f7..dcda3cad3b5c 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 
diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index 9a4871636c6c..a7367ff573d4 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include "misc.h"
 #include "compression.h"
 #include "ctree.h"
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 1dfb126a0cb2..e056f3a2b404 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #include "f2fs.h"
 #include "node.h"
diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
index b7cb1faa652d..f8c512a6204e 100644
--- a/fs/squashfs/zstd_wrapper.c
+++ b/fs/squashfs/zstd_wrapper.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include "squashfs_fs.h"
diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h
new file mode 100644
index ..cda9208bf04a
--- /dev/null
+++ b/include/linux/zstd_compat.h
@@ -0,0 +1,116 @@
+/*
+ * Copyright (c) 2016-present, Facebook, Inc.
+ * All rights reserved.
+ *
+ * This source code is licensed under the BSD-style license found in the
+ * LICENSE file in the root directory of https://github.com/facebook/zstd.
+ * An additional grant of patent rights can be found in the PATENTS file in the
+ * same directory.
+ *
+ * This program is free software; you can redistribute it and/or modify it 
under
+ * the terms of the GNU General Public License version 2 as published by the
+ * Free Software Foundation. This program is dual-licensed; you may select
+ * either version 2 of the GNU General Public License ("GPL") or BSD license
+ * ("BSD").
+ */
+
+#ifndef ZSTD_COMPAT_H
+#define ZSTD_COMPAT_H
+
+#include 
+
+#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406)
+/*
+ * This header provides backwards compatibility for the zstd-1.4.6 library
+ * upgrade. This header allows us to upgrade the zstd library version without
+ * modifying any callers. Then we will migrate callers from the compatibility
+ * wrapper one at a time until none remain. At which point we will delete this
+ * header.
+ *
+ * It is temporary and will be deleted once the upgrade is complete.
+ */
+
+#include 
+
+static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters 
compression_params)
+{
+return ZSTD_estimateCCtxSize_usingCParams(compression_params);
+}
+
+static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters 
compression_params)
+{
+return ZSTD_estimateCStreamSize_usingCParams(compression_params);
+}
+
+static inline size_t ZSTD_DCtxWorkspaceBound(void)
+{
+return ZSTD_estimateDCtxSize();
+}
+
+static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size)
+{
+return ZSTD_estimateDStreamSize(window_size);
+}
+
+static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size)
+{
+if (wksp == NULL)
+return NULL;
+return ZSTD_initStaticCCtx(wksp, wksp_size);
+}
+
+static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, 
uint64_t pledged_src_size, void* wksp, size_t wksp_size)
+{
+ZSTD_CStream* cstream;
+size_t ret;
+
+if (wksp == NULL)
+return NULL;
+
+cstream = ZSTD_initStaticCStream(wksp, wksp_size);
+if (cstream == NULL)
+return NULL;
+
+/* 0 means unknown in old API but means 0 in new API */
+if (pledged_src_size == 0)
+pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN;
+
+ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, 
pledged_src_size);
+if (ZSTD_isError(ret))
+return NULL;
+
+return cstream;
+}
+#define ZS

[PATCH v2 2/9] lib: zstd: Add decompress_sources.h for decompress_unzstd

2020-09-22 Thread Nick Terrell
From: Nick Terrell 

Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.

This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.

Signed-off-by: Nick Terrell 
---
 lib/decompress_unzstd.c   |  6 +-
 lib/zstd/decompress_sources.h | 14 ++
 2 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 lib/zstd/decompress_sources.h

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index dbc290af26b4..a79f705f236d 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
 #ifdef STATIC
 # define UNZSTD_PREBOOT
 # include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
 #endif
 
 #include 
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index ..ccb4960ea0cd
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
-- 
2.28.0



Re: [PATCH 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API

2020-09-17 Thread Nick Terrell


> On Sep 17, 2020, at 6:47 PM, Chao Yu  wrote:
> 
> On 2020/9/18 3:34, Nick Terrell wrote:
>>> On Sep 17, 2020, at 11:00 AM, Nick Terrell  wrote:
>>> 
>>> 
>>> 
>>>> On Sep 16, 2020, at 11:31 PM, Chao Yu  wrote:
>>>> 
>>>> Hi Nick,
>>>> 
>>>> On 2020/9/17 2:39, Nick Terrell wrote:
>>>>>> On Sep 15, 2020, at 11:31 PM, Chao Yu  wrote:
>>>>>> 
>>>>>> Hi Nick,
>>>>>> 
>>>>>> remove not related mailing list.
>>>>>> 
>>>>>> On 2020/9/16 11:43, Nick Terrell wrote:
>>>>>>> From: Nick Terrell 
>>>>>>> Move away from the compatibility wrapper to the zstd-1.4.6 API. This
>>>>>>> code is more efficient because it uses the single-pass API instead of
>>>>>>> the streaming API. The streaming API is not necessary because the whole
>>>>>>> input and output buffers are available. This saves memory because we
>>>>>>> don't need to allocate a buffer for the window. It is also more
>>>>>>> efficient because it saves unnecessary memcpy calls.
>>>>>>> I've had problems testing this code because I see data truncation before
>>>>>>> and after this patchset. Help testing this patch would be much
>>>>>>> appreciated.
>>>>>> 
>>>>>> Can you please explain more about data truncation? I'm a little 
>>>>>> confused...
>>>>>> 
>>>>>> Do you mean that f2fs doesn't allocate enough memory for zstd 
>>>>>> compression,
>>>>>> so that compression is not finished actually, the compressed data is 
>>>>>> truncated
>>>>>> at dst buffer?
>>>>> Hi Chao,
>>>>> I’ve tested F2FS using a benchmark I adapted from testing BtrFS [0]. It 
>>>>> is possible
>>>>> that the script I’m using is buggy or is exposing an edge case in F2FS. 
>>>>> The files
>>>>> that I copy to F2FS and compress end up truncated with a hole at the end.
>>>> 
>>>> Thanks for your explanation. :)
>>>> 
>>>>> It is based off of upstream commit ab29a807a7.
>>>>> E.g. the end of the copied file looks like this, but the original file
>>>>> has non-zero data at the end. Until the hole at the end, the file is
>>>>> correct.
>>>>> od dickens | tail -n 5
>>>>>> 46667760 067502 066167 020056 040440 020163 023511 006555 060412
>>>>>> 4667 00 00 00 00 00 00 00 00
>>>>>> *
>>>>>> 46703060 00 00 00 00 00 00 00
>>>>>> 46703076
>>>>> [0] https://gist.github.com/terrelln/7dd2919937dfbdb8e839e4ad11c81db4
>>>> 
>>>> Shouldn't we just get sha1 value by filtering sha1sum output?
>>>> 
>>>>   asha=`sha1sum $BENCHMARK_DIR/$file |awk {'print $1'}`
>>>>   bsha=`sha1sum $MP/$i/$file |awk {'print $1'}`
>>> 
>>> Probably, but it was just a quick one-off script.
>> Ah, never mind, you are right.
>>>> I can't reproduce this issue by using simple data sample, could you share
>>>> that 'dickens' file or other smaller-sized sample if you have?
>>> 
>>> The /tmp/silesia directory in the example is populated with all the files 
>>> from
>>> this website. It is a popular data compression benchmark corpus. You can
>>> click on the “total” link to download a zip archive of all the files.
>>> 
>>> http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
>>>  
>>> -Nick
>> I’ve spent some time minimizing the test case. This script [0] is the 
>> minimized
>> test case that doesn’t require any input files, it builds its own.
>> Several observations:
>> * The input file needs to be 7700481 bytes large, smaller files don’t 
>> trigger the bug.
>> * You have to `chattr +c` the file after copying it otherwise the bug 
>> doesn’t occur.
>> * After `chattr +c` you have to unmount and remount the filesystem to 
>> trigger the bug.
>> I’ve reproduced on v5.9-rc5 (856deb866d16e). I’ve also reproduced on m

Re: [PATCH 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API

2020-09-17 Thread Nick Terrell


> On Sep 17, 2020, at 11:00 AM, Nick Terrell  wrote:
> 
> 
> 
>> On Sep 16, 2020, at 11:31 PM, Chao Yu  wrote:
>> 
>> Hi Nick,
>> 
>> On 2020/9/17 2:39, Nick Terrell wrote:
>>>> On Sep 15, 2020, at 11:31 PM, Chao Yu  wrote:
>>>> 
>>>> Hi Nick,
>>>> 
>>>> remove not related mailing list.
>>>> 
>>>> On 2020/9/16 11:43, Nick Terrell wrote:
>>>>> From: Nick Terrell 
>>>>> Move away from the compatibility wrapper to the zstd-1.4.6 API. This
>>>>> code is more efficient because it uses the single-pass API instead of
>>>>> the streaming API. The streaming API is not necessary because the whole
>>>>> input and output buffers are available. This saves memory because we
>>>>> don't need to allocate a buffer for the window. It is also more
>>>>> efficient because it saves unnecessary memcpy calls.
>>>>> I've had problems testing this code because I see data truncation before
>>>>> and after this patchset. Help testing this patch would be much
>>>>> appreciated.
>>>> 
>>>> Can you please explain more about data truncation? I'm a little confused...
>>>> 
>>>> Do you mean that f2fs doesn't allocate enough memory for zstd compression,
>>>> so that compression is not finished actually, the compressed data is 
>>>> truncated
>>>> at dst buffer?
>>> Hi Chao,
>>> I’ve tested F2FS using a benchmark I adapted from testing BtrFS [0]. It is 
>>> possible
>>> that the script I’m using is buggy or is exposing an edge case in F2FS. The 
>>> files
>>> that I copy to F2FS and compress end up truncated with a hole at the end.
>> 
>> Thanks for your explanation. :)
>> 
>>> It is based off of upstream commit ab29a807a7.
>>> E.g. the end of the copied file looks like this, but the original file has
>>> non-zero data at the end. Until the hole at the end the file is correct.
>>> od dickens | tail -n 5
>>>> 46667760 067502 066167 020056 040440 020163 023511 006555 060412
>>>> 4667 00 00 00 00 00 00 00 00
>>>> *
>>>> 46703060 00 00 00 00 00 00 00
>>>> 46703076
>>> [0] https://gist.github.com/terrelln/7dd2919937dfbdb8e839e4ad11c81db4
>> 
>> Shouldn't we just get sha1 value by filtering sha1sum output?
>> 
>>   asha=`sha1sum $BENCHMARK_DIR/$file |awk {'print $1'}`
>>   bsha=`sha1sum $MP/$i/$file |awk {'print $1'}`
> 
> Probably, but it was just a quick one-off script.

Ah, never mind, you are right.

>> I can't reproduce this issue by using simple data sample, could you share
>> that 'dickens' file or other smaller-sized sample if you have?
> 
> The /tmp/silesia directory in the example is populated with all the files from
> this website. It is a popular data compression benchmark corpus. You can
> click on the “total” link to download a zip archive of all the files.
> 
> http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia
> 
> -Nick

I’ve spent some time minimizing the test case. This script [0] is the minimized
test case; it doesn’t require any input files because it builds its own.

Several observations:
* The input file needs to be 7700481 bytes large; smaller files don’t trigger
  the bug.
* You have to `chattr +c` the file after copying it, otherwise the bug doesn’t
  occur.
* After `chattr +c` you have to unmount and remount the filesystem to trigger
  the bug.

I’ve reproduced on v5.9-rc5 (856deb866d16e). I’ve also reproduced on my host
machine running 5.8.5-arch1-1.

[0] https://gist.github.com/terrelln/4bba325abdfa3a6f014e9911ac92a185

Best,
Nick

>> Thanks,
>> 
>>> Best,
>>> Nick
>>>> Thanks,
>>>> 
>>>>> Signed-off-by: Nick Terrell 
>>>>> ---
>>>>> fs/f2fs/compress.c | 102 +
>>>>> 1 file changed, 38 insertions(+), 64 deletions(-)
>>>>> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
>>>>> index e056f3a2b404..b79efce81651 100644
>>>>> --- a/fs/f2fs/compress.c
>>>>> +++ b/fs/f2fs/compress.c
>>>>> @@ -11,7 +11,8 @@
>>>>> #include 
>>>>> #include 
>>>>> #include 
>>>>> -#include 
>>>>> +#include 
>>>>> +#include 
>>>>>   #include "f2fs.h"
>>>>> #include "node.h"

Re: [PATCH 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API

2020-09-17 Thread Nick Terrell


> On Sep 16, 2020, at 11:31 PM, Chao Yu  wrote:
> 
> Hi Nick,
> 
> On 2020/9/17 2:39, Nick Terrell wrote:
>>> On Sep 15, 2020, at 11:31 PM, Chao Yu  wrote:
>>> 
>>> Hi Nick,
>>> 
>>> remove not related mailing list.
>>> 
>>> On 2020/9/16 11:43, Nick Terrell wrote:
>>>> From: Nick Terrell 
>>>> Move away from the compatibility wrapper to the zstd-1.4.6 API. This
>>>> code is more efficient because it uses the single-pass API instead of
>>>> the streaming API. The streaming API is not necessary because the whole
>>>> input and output buffers are available. This saves memory because we
>>>> don't need to allocate a buffer for the window. It is also more
>>>> efficient because it saves unnecessary memcpy calls.
>>>> I've had problems testing this code because I see data truncation before
>>>> and after this patchset. Help testing this patch would be much
>>>> appreciated.
>>> 
>>> Can you please explain more about data truncation? I'm a little confused...
>>> 
>>> Do you mean that f2fs doesn't allocate enough memory for zstd compression,
>>> so that compression is not finished actually, the compressed data is 
>>> truncated
>>> at dst buffer?
>> Hi Chao,
>> I’ve tested F2FS using a benchmark I adapted from testing BtrFS [0]. It is 
>> possible
>> that the script I’m using is buggy or is exposing an edge case in F2FS. The 
>> files
>> that I copy to F2FS and compress end up truncated with a hole at the end.
> 
> Thanks for your explanation. :)
> 
>> It is based off of upstream commit ab29a807a7.
>> E.g. the end of the copied file looks like this, but the original file has
>> non-zero data at the end. Until the hole at the end the file is correct.
>> od dickens | tail -n 5
>>> 46667760 067502 066167 020056 040440 020163 023511 006555 060412
>>> 4667 00 00 00 00 00 00 00 00
>>> *
>>> 46703060 00 00 00 00 00 00 00
>>> 46703076
>> [0] https://gist.github.com/terrelln/7dd2919937dfbdb8e839e4ad11c81db4
> 
> Shouldn't we just get sha1 value by filtering sha1sum output?
> 
>asha=`sha1sum $BENCHMARK_DIR/$file |awk {'print $1'}`
>bsha=`sha1sum $MP/$i/$file |awk {'print $1'}`

Probably, but it was just a quick one-off script.

> I can't reproduce this issue by using simple data sample, could you share
> that 'dickens' file or other smaller-sized sample if you have?

The /tmp/silesia directory in the example is populated with all the files from
this website. It is a popular data compression benchmark corpus. You can
click on the “total” link to download a zip archive of all the files.

http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia

-Nick

> Thanks,
> 
>> Best,
>> Nick
>>> Thanks,
>>> 
>>>> Signed-off-by: Nick Terrell 
>>>> ---
>>>>  fs/f2fs/compress.c | 102 +
>>>>  1 file changed, 38 insertions(+), 64 deletions(-)
>>>> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
>>>> index e056f3a2b404..b79efce81651 100644
>>>> --- a/fs/f2fs/compress.c
>>>> +++ b/fs/f2fs/compress.c
>>>> @@ -11,7 +11,8 @@
>>>>  #include 
>>>>  #include 
>>>>  #include 
>>>> -#include 
>>>> +#include 
>>>> +#include 
>>>>#include "f2fs.h"
>>>>  #include "node.h"
>>>> @@ -298,21 +299,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = 
>>>> {
>>>>  static int zstd_init_compress_ctx(struct compress_ctx *cc)
>>>>  {
>>>>ZSTD_parameters params;
>>>> -  ZSTD_CStream *stream;
>>>> +  ZSTD_CCtx *ctx;
>>>>void *workspace;
>>>>unsigned int workspace_size;
>>>>params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0);
>>>> -  workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams);
>>>> +  workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams);
>>>>workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode),
>>>>workspace_size, GFP_NOFS);
>>>>if (!workspace)
>>>>return -ENOMEM;
>>>>  - stream = ZSTD_initCStream(params, 0, workspace, workspace_size);
>>>> -  if (!stream) {
>>>> -  printk_ratelimited("%sF2

Re: [PATCH 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API

2020-09-17 Thread Nick Terrell


> On Sep 17, 2020, at 7:28 AM, Chris Mason  wrote:
> 
> On 17 Sep 2020, at 6:04, Christoph Hellwig wrote:
> 
>> On Wed, Sep 16, 2020 at 09:35:51PM -0400, Rik van Riel wrote:
>>>> One possibility is to have a kernel wrapper on top of the zstd API to
>>>> make it more ergonomic. I personally don’t really see the value in it,
>>>> since it adds another layer of indirection between zstd and the caller,
>>>> but it could be done.
>>> 
>>> Zstd would not be the first part of the kernel to
>>> come from somewhere else, and have wrappers when
>>> it gets integrated into the kernel. There certainly
>>> is precedence there.
>>> 
>>> It would be interesting to know what Christoph's
>>> preference is.
>> 
>> Yes, I think kernel wrappers would be a pretty sensible step forward.
>> That also avoid the need to do strange upgrades to a new version,
>> and instead we can just change APIs on a as-needed basis.
> 
> When we add wrappers, we end up creating a kernel specific API that doesn’t 
> match the upstream zstd docs, and it doesn’t leverage as much of the zstd 
> fuzzing and testing.
> 
> So we’re actually making kernel zstd slightly less usable in hopes that our 
> kernel specific part of the API is familiar enough to us that it makes zstd 
> more usable.  There’s no way to compare the two until the wrappers are done, 
> but given the code today I’d prefer that we focus on making it really easy to 
> track upstream.  I really understand Christoph’s side here, but I’d rather 
> ride a camel with the group than go it alone.
> 
> I’d also much rather spend time on any problems where the structure of the 
> zstd APIs don’t fit the kernel’s needs.  The btrfs streaming 
> compression/decompression looks pretty clean to me, but I think Johannes 
> mentioned some possibilities to improve things for zswap (optimizations for 
> page-at-a-time).  If there are places where the zstd memory management or
> error handling don’t fit naturally into the kernel, that would also be higher 
> on my list.

This update includes the recent optimizations for ZSwap that I've made, which
give a 30% speed boost for page-at-a-time decompression.

We're very open to improving and changing zstd to better fit the needs of the
kernel. If there are use cases that can't use the existing API, or the existing
API isn't optimal, or any other problems, we’re happy to help figure out the
best solution. Opening an issue on our upstream GitHub repo is the best way to
get our attention.

-Nick

> Fixing those are probably going to be much easier if we’re close to the zstd 
> upstream, again so that we can leverage testing and long term code 
> maintenance done there.
> 
> -chris



Re: [PATCH 1/9] lib: zstd: Add zstd compatibility wrapper

2020-09-17 Thread Nick Terrell


> On Sep 16, 2020, at 1:48 AM, Christoph Hellwig  wrote:
> 
> On Tue, Sep 15, 2020 at 08:42:54PM -0700, Nick Terrell wrote:
>> From: Nick Terrell 
>> 
>> Adds zstd_compat.h which provides the necessary functions from the
>> current zstd.h API. It is only active for zstd versions 1.4.6 and newer.
>> That means it is disabled currently, but will become active when a later
>> patch in this series updates the zstd library in the kernel to 1.4.6.
>> 
>> This header allows the zstd upgrade to 1.4.6 without changing any
>> callers, since they all include zstd through the compatibility wrapper.
>> Later patches in this series transition each caller away from the
>> compatibility wrapper. After all the callers have been transitioned away
>> from the compatibility wrapper, the final patch in this series deletes
>> it.
> 
> Please just add wrappers to the main header instead of causing all
> this churn.

The goal of having it in a separate header is so the 3rd patch that actually
updates zstd can be 100% automatically generated. I didn’t want to mix
a small amount of edits into a large generated patch, because that would
be easy to miss.
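
To illustrate the approach, here is a minimal sketch of a version-gated
compatibility shim. It is not the real zstd_compat.h: the newlib_* and
oldlib_* names and the 4096 value are hypothetical stand-ins. Only the
version-number scheme (major * 10000 + minor * 100 + release, so 1.4.6 is
10406) mirrors what zstd does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "new" library; stands in for the upgraded upstream code. */
#define NEWLIB_VERSION_MAJOR   1
#define NEWLIB_VERSION_MINOR   4
#define NEWLIB_VERSION_RELEASE 6
#define NEWLIB_VERSION_NUMBER \
	(NEWLIB_VERSION_MAJOR * 100 * 100 + NEWLIB_VERSION_MINOR * 100 + \
	 NEWLIB_VERSION_RELEASE)

/* New-style entry point (illustrative return value only). */
static size_t newlib_estimate_ctx_size(void)
{
	return 4096;
}

/*
 * Compatibility shim: active only once the library is at 1.4.6 or newer.
 * It forwards the old name to the new one so callers need no changes.
 */
#if NEWLIB_VERSION_NUMBER >= 10406
static inline size_t oldlib_ctx_workspace_bound(void)
{
	size_t size = newlib_estimate_ctx_size();

	assert(size > 0); /* sanity check on the forwarded value */
	return size;
}
#endif

/* An unmodified caller keeps using the old name. */
static size_t caller_workspace_size(void)
{
	return oldlib_ctx_workspace_bound();
}
```

A caller such as caller_workspace_size() keeps compiling against the old name.
Once every caller has moved to newlib_estimate_ctx_size(), the shim can be
deleted without touching the generated library code.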

Re: [PATCH 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API

2020-09-17 Thread Nick Terrell


> On Sep 16, 2020, at 7:46 AM, Christoph Hellwig  wrote:
> 
> On Wed, Sep 16, 2020 at 10:43:04AM -0400, Chris Mason wrote:
>> Otherwise we just end up with drift and kernel-specific bugs that are harder
>> to debug.  To the extent those APIs make us contort the kernel code, I’m
>> sure Nick is interested in improving things in both places.
> 
> Seriously, we do not care elsewhere.  Why would zlib be any different?
> 
>> There are probably 1000 constructive ways to have that conversation.  Please
>> choose one of those instead of being an asshole.
> 
> I think you are the asshole here by ignoring the practices we are using
> elsewhere and think your employers pet project is somehow special.  It
> is not, and claiming so is everything but constructive.

My goal in updating the kernel's zstd to use the upstream API directly is to
make frequent syncs into the kernel easy. This is important so the kernel
doesn't miss out on bug fixes and performance improvements.

The upstream zstd is continuously fuzzed and is battle tested in production
and across many different projects external to Facebook. That means that
zstd-1.4.6 has an additional 3 years of continuous fuzzing, as well as
improvements to our fuzz and test suite.

The zstd version in the kernel works fine. But, you can see that the version
that got imported stagnated while upstream released 14 versions. I
don't think it makes sense to have kernel developers maintain their own copy
of zstd. Their time would be better spent working on the rest of the kernel.
Using upstream directly lets the kernel profit from the work that we, the zstd
developers, are doing. And it still allows kernel developers to fix bugs if any
show up, and we can back-port them to upstream.

For example, I’ve measured that BtrFS decompression + read performance
is improved 15% with this patch. And ZRAM performance improves 30%.
And SquashFS decompression + read performance improves 15%.

Admittedly, the API provided for static workspace allocation is verbose. Most
zstd users don’t need it, so our efforts to improve the ergonomics of the API
haven’t been focused here. At this point, we couldn’t rename these APIs easily,
since we have users relying on our API. It could be done, because we don’t
guarantee ABI stability for this portion of the API, but we would have to have
a good reason for it.

One possibility is to have a kernel wrapper on top of the zstd API to make it
more ergonomic. I personally don’t really see the value in it, since it adds
another layer of indirection between zstd and the caller, but it could be done.
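
For readers unfamiliar with the static-workspace pattern discussed above, here
is a minimal sketch of its shape: estimate a size, let the caller allocate, then
build the context inside that memory. The demo_* names and the struct layout
are hypothetical and are not the zstd API; they only illustrate why the pattern
takes more steps than a malloc-backed constructor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical context; stands in for an opaque compression context. */
struct demo_ctx {
	unsigned char *scratch; /* points into the caller's workspace */
	size_t scratch_size;
};

/* Step 1: tell the caller how much memory the context needs. */
static size_t demo_estimate_ctx_size(size_t scratch_size)
{
	/* Guard against size_t overflow in the sum below. */
	assert(scratch_size <= SIZE_MAX - sizeof(struct demo_ctx));
	return sizeof(struct demo_ctx) + scratch_size;
}

/* Step 2: build the context inside caller-provided memory, no malloc. */
static struct demo_ctx *demo_init_static_ctx(void *wksp, size_t wksp_size,
					     size_t scratch_size)
{
	struct demo_ctx *ctx = wksp;

	/* Reject missing or misaligned workspaces. */
	if (wksp == NULL || (uintptr_t)wksp % sizeof(void *) != 0)
		return NULL;
	/* Reject workspaces that are too small. */
	if (wksp_size < demo_estimate_ctx_size(scratch_size))
		return NULL;

	ctx->scratch = (unsigned char *)wksp + sizeof(*ctx);
	ctx->scratch_size = scratch_size;
	return ctx;
}
```

In kernel code the workspace would come from something like vmalloc() or
vzalloc(), as the callers in this series do, and the caller frees it when the
context is no longer needed.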

Of all the compressors in the kernel, only lz4 and zstd are under active
development. And lz4 has switched to using the upstream API directly.
Xz does see a little bit of development, but nothing has been synced to the
kernel.

Best,
Nick

Re: [PATCH 3/9] lib: zstd: Upgrade to latest upstream zstd version 1.4.6

2020-09-16 Thread Nick Terrell
> On Sep 15, 2020, at 8:42 PM, Nick Terrell  wrote:
> 
> From: Nick Terrell 
> 
> Upgrade to the latest upstream zstd version 1.4.6.
> 
> This patch is 100% generated from upstream zstd commit c4763f087c2b [0].
> 
> This patch is very large because it is transitioning from the custom
> kernel zstd to using upstream directly. The new zstd follows upstream's
> file structure which is different. Future update patches will be much
> smaller because they will only contain the changes from one upstream
> zstd release.
> 
> The benefits of this patch are as follows:
> 1. Using upstream directly with automated script to generate kernel
>   code. This allows us to update the kernel every upstream release, so
>   the kernel gets the latest bug fixes and performance improvements,
>   and doesn't get 3 years out of date again. The automation and the
>   translated code are tested every upstream commit to ensure it
>   continues to work.
> 2. Upgrades from a custom zstd based on 1.3.1 to 1.4.6, getting 3 years
>   of performance improvements and bug fixes. On x86_64 I've measured
>   15% faster BtrFS and SquashFS decompression+read speeds, 35% faster
>   kernel decompression, and 30% faster ZRAM decompression+read speeds.
>   Additionally, the latest zstd uses ~1 KB less stack space for
>   compression.
> 3. Switches to using the upstream API directly. It is slightly less
>   ergonomic for the kernel use case, where malloc/free aren't provided.
>   But, it means that users don't need to familiarize themselves with 2
>   zstd APIs.
> 
> I chose the bulk update instead of replaying upstream commits because
> there have been ~3500 upstream commits since the 1.3.1 release, zstd
> wasn't ready to be used in the kernel as-is before a month ago, and not
> all upstream zstd commits build. The bulk update preserves bisectability
> because bugs can be bisected to the zstd version update. At that point
> the update can be reverted, and we can work with upstream to find and
> fix the bug.
> 
> Note that upstream zstd release 1.4.6 doesn't exist yet. I have cut a
> staging branch at c4763f087c2b [0] and will apply any changes requested
> to the staging branch. Once we're ready to merge this update I will cut
> a zstd release at the commit we merge, so we have a known zstd release
> in the kernel.
> 
> [0] 
> https://github.com/facebook/zstd/commit/c4763f087c2b4b5857a8323ff3360b240db23786
> 
> Signed-off-by: Nick Terrell 

Below is a diff that shows the difference between upstream zstd imported
directly into the kernel, and the version in this patch that uses upstream's
automation to generate a working zstd. I hope it is helpful for review, since I
know the full patch is way too large for a meaningful review.

The automation does several necessary things:
* Rewrite libc headers
* Replace bundled xxhash with kernel xxhash
* Provide zstd_deps.h, which holds all of zstd’s libc dependencies

It also hardwires certain preprocessor macros to avoid unnecessary
portability code in the kernel. This is not strictly necessary, because these
macros could be defined at compile time. See [1] for a list of macros.

This diff is also available at [0].

[0] https://gist.github.com/terrelln/5a266ef4f6ee8bc60dde192daaaf2c97
[1] 
https://github.com/facebook/zstd/blob/d96e98cfde66e9e20dcadcfd9ed3b82ba648adfe/contrib/linux-kernel/Makefile#L17

Best,
Nick

---
 include/linux/zstd.h  |  28 +---
 include/linux/zstd_errors.h   |  24 +--
 lib/zstd/common/bitstream.h   |  28 +---
 lib/zstd/common/compiler.h|  91 ++-
 lib/zstd/common/cpu.h |  21 +--
 lib/zstd/common/debug.h   |   6 -
 lib/zstd/common/entropy_common.c  |   7 +-
 lib/zstd/common/error_private.h   |  18 +--
 lib/zstd/common/fse.h |   8 +-
 lib/zstd/common/fse_decompress.c  |  13 --
 lib/zstd/common/huf.h |   8 +-
 lib/zstd/common/mem.h |  77 +-
 lib/zstd/common/zstd_deps.h   | 110 ++---
 lib/zstd/common/zstd_internal.h   |  35 +
 lib/zstd/compress/fse_compress.c  |  80 --
 lib/zstd/compress/hist.c  |  16 --
 lib/zstd/compress/huf_compress.c  |  35 -
 lib/zstd/compress/zstd_compress.c | 135 +---
 lib/zstd/compress/zstd_compress_internal.h|  42 +
 lib/zstd/compress/zstd_compress_superblock.h  |   2 +-
 lib/zstd/compress/zstd_cwksp.h|   6 -
 lib/zstd/compress/zstd_double_fast.h  |   6 -
 lib/zstd/compress/zstd_fast.h |   6 -
 lib/zstd/compress/zstd_lazy.c |   4 +-
 lib/zstd/compress/zstd_lazy.h |   6 -
 lib/zstd/com

Re: [PATCH 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API

2020-09-16 Thread Nick Terrell


> On Sep 15, 2020, at 11:31 PM, Chao Yu  wrote:
> 
> Hi Nick,
> 
> remove not related mailing list.
> 
> On 2020/9/16 11:43, Nick Terrell wrote:
>> From: Nick Terrell 
>> Move away from the compatibility wrapper to the zstd-1.4.6 API. This
>> code is more efficient because it uses the single-pass API instead of
>> the streaming API. The streaming API is not necessary because the whole
>> input and output buffers are available. This saves memory because we
>> don't need to allocate a buffer for the window. It is also more
>> efficient because it saves unnecessary memcpy calls.
>> I've had problems testing this code because I see data truncation before
>> and after this patchset. Help testing this patch would be much
>> appreciated.
> 
> Can you please explain more about data truncation? I'm a little confused...
> 
> Do you mean that f2fs doesn't allocate enough memory for zstd compression,
> so that compression is not finished actually, the compressed data is truncated
> at dst buffer?

Hi Chao,

I’ve tested F2FS using a benchmark I adapted from testing BtrFS [0]. It is
possible that the script I’m using is buggy or is exposing an edge case in
F2FS. The files that I copy to F2FS and compress end up truncated with a hole
at the end.

It is based off of upstream commit ab29a807a7.

E.g. the end of the copied file looks like this, but the original file has
non-zero data at the end. Until the hole at the end the file is correct.

od dickens | tail -n 5
> 46667760 067502 066167 020056 040440 020163 023511 006555 060412
> 4667 00 00 00 00 00 00 00 00
> *
> 46703060 00 00 00 00 00 00 00
> 46703076

[0] https://gist.github.com/terrelln/7dd2919937dfbdb8e839e4ad11c81db4
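
For anyone trying to confirm the same symptom without reading od output, here
is a minimal sketch of a check for "correct prefix followed by a zero-filled
hole". It is not part of the benchmark script [0]; the function names are
hypothetical, and it assumes both files have already been read into memory
buffers of the same length.

```c
#include <assert.h>
#include <stddef.h>

/* Return the offset of the first differing byte, or len if none differ. */
static size_t first_mismatch(const unsigned char *a, const unsigned char *b,
			     size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if (a[i] != b[i])
			return i;
	return len;
}

/* Return the offset where the trailing run of zero bytes begins. */
static size_t trailing_zero_start(const unsigned char *buf, size_t len)
{
	while (len > 0 && buf[len - 1] == 0)
		len--;
	return len;
}

/*
 * Return 1 if copy matches orig up to some offset and is all zeros from
 * the first differing byte to the end, 0 otherwise.
 */
static int looks_like_trailing_hole(const unsigned char *orig,
				    const unsigned char *copy, size_t len)
{
	size_t diff;

	assert(orig != NULL && copy != NULL);
	diff = first_mismatch(orig, copy, len);
	return diff < len && trailing_zero_start(copy, len) <= diff;
}
```

If looks_like_trailing_hole() returns 1, first_mismatch() gives the offset
where the hole starts, which can be compared against the offsets that od
prints.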

Best,
Nick

> Thanks,
> 
>> Signed-off-by: Nick Terrell 
>> ---
>>  fs/f2fs/compress.c | 102 +
>>  1 file changed, 38 insertions(+), 64 deletions(-)
>> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
>> index e056f3a2b404..b79efce81651 100644
>> --- a/fs/f2fs/compress.c
>> +++ b/fs/f2fs/compress.c
>> @@ -11,7 +11,8 @@
>>  #include 
>>  #include 
>>  #include 
>> -#include 
>> +#include 
>> +#include 
>>#include "f2fs.h"
>>  #include "node.h"
>> @@ -298,21 +299,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = {
>>  static int zstd_init_compress_ctx(struct compress_ctx *cc)
>>  {
>>  ZSTD_parameters params;
>> -ZSTD_CStream *stream;
>> +ZSTD_CCtx *ctx;
>>  void *workspace;
>>  unsigned int workspace_size;
>>  params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0);
>> -workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams);
>> +workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams);
>>  workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode),
>>  workspace_size, GFP_NOFS);
>>  if (!workspace)
>>  return -ENOMEM;
>>  -   stream = ZSTD_initCStream(params, 0, workspace, workspace_size);
>> -if (!stream) {
>> -printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initCStream 
>> failed\n",
>> +ctx = ZSTD_initStaticCCtx(workspace, workspace_size);
>> +if (!ctx) {
>> +printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initStaticCCtx failed\n",
>>  KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
>>  __func__);
>>  kvfree(workspace);
>> @@ -320,7 +321,7 @@ static int zstd_init_compress_ctx(struct compress_ctx 
>> *cc)
>>  }
>>  cc->private = workspace;
>> -cc->private2 = stream;
>> +cc->private2 = ctx;
>>  cc->clen = cc->rlen - PAGE_SIZE - COMPRESS_HEADER_SIZE;
>>  return 0;
>> @@ -335,65 +336,48 @@ static void zstd_destroy_compress_ctx(struct 
>> compress_ctx *cc)
>>static int zstd_compress_pages(struct compress_ctx *cc)
>>  {
>> -ZSTD_CStream *stream = cc->private2;
>> -ZSTD_inBuffer inbuf;
>> -ZSTD_outBuffer outbuf;
>> -int src_size = cc->rlen;
>> -int dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
>> -int ret;
>> -
>> -inbuf.pos = 0;
>> -inbuf.src = cc->rbuf;
>> -inbuf.size = src_size;
>> -
>> -outbuf.pos = 0;
>> -outbuf.dst = cc->cbuf->cdata;
>> -outbuf.size = dst_size;
>> -
>> -ret = ZSTD_compressStream(stream, &outbuf, &inbuf);
>> -  

[PATCH 4/9] crypto: zstd: Switch to zstd-1.4.6 API

2020-09-15 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 crypto/zstd.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/crypto/zstd.c b/crypto/zstd.c
index dcda3cad3b5c..767fe2fbe009 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 
@@ -24,16 +24,15 @@ struct zstd_ctx {
void *dwksp;
 };
 
-static ZSTD_parameters zstd_params(void)
-{
-   return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
-}
-
 static int zstd_comp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const ZSTD_parameters params = zstd_params();
-   const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+   const size_t wksp_size = ZSTD_estimateCCtxSize(ZSTD_DEF_LEVEL);
+
+   if (ZSTD_isError(wksp_size)) {
+   ret = -EINVAL;
+   goto out_free;
+   }
 
ctx->cwksp = vzalloc(wksp_size);
if (!ctx->cwksp) {
@@ -41,7 +40,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+   ctx->cctx = ZSTD_initStaticCCtx(ctx->cwksp, wksp_size);
if (!ctx->cctx) {
ret = -EINVAL;
goto out_free;
@@ -56,7 +55,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 static int zstd_decomp_init(struct zstd_ctx *ctx)
 {
int ret = 0;
-   const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+   const size_t wksp_size = ZSTD_estimateDCtxSize();
 
ctx->dwksp = vzalloc(wksp_size);
if (!ctx->dwksp) {
@@ -64,7 +63,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx)
goto out;
}
 
-   ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+   ctx->dctx = ZSTD_initStaticDCtx(ctx->dwksp, wksp_size);
if (!ctx->dctx) {
ret = -EINVAL;
goto out_free;
@@ -152,9 +151,8 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
 {
size_t out_len;
struct zstd_ctx *zctx = ctx;
-   const ZSTD_parameters params = zstd_params();
 
-   out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
+   out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, ZSTD_DEF_LEVEL);
if (ZSTD_isError(out_len))
return -EINVAL;
*dlen = out_len;
-- 
2.28.0



[PATCH 7/9] squashfs: zstd: Switch to the zstd-1.4.6 API

2020-09-15 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 fs/squashfs/zstd_wrapper.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
index f8c512a6204e..add582409866 100644
--- a/fs/squashfs/zstd_wrapper.c
+++ b/fs/squashfs/zstd_wrapper.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include "squashfs_fs.h"
@@ -34,7 +34,7 @@ static void *zstd_init(struct squashfs_sb_info *msblk, void *buff)
goto failed;
wksp->window_size = max_t(size_t,
msblk->block_size, SQUASHFS_METADATA_SIZE);
-   wksp->mem_size = ZSTD_DStreamWorkspaceBound(wksp->window_size);
+   wksp->mem_size = ZSTD_estimateDStreamSize(wksp->window_size);
wksp->mem = vmalloc(wksp->mem_size);
if (wksp->mem == NULL)
goto failed;
@@ -71,7 +71,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
struct bvec_iter_all iter_all = {};
struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
 
-   stream = ZSTD_initDStream(wksp->window_size, wksp->mem, wksp->mem_size);
+   stream = ZSTD_initStaticDStream(wksp->mem, wksp->mem_size);
 
if (!stream) {
ERROR("Failed to initialize zstd decompressor\n");
@@ -122,8 +122,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
break;
 
if (ZSTD_isError(zstd_err)) {
-   ERROR("zstd decompression error: %d\n",
-   (int)ZSTD_getErrorCode(zstd_err));
+   ERROR("zstd decompression error: %s\n", ZSTD_getErrorName(zstd_err));
error = -EIO;
break;
}
-- 
2.28.0



[PATCH 9/9] lib: zstd: Remove zstd compatibility wrapper

2020-09-15 Thread Nick Terrell
From: Nick Terrell 

All callers have been transitioned to the new zstd-1.4.6 API. There are
no more callers of the zstd compatibility wrapper, so delete it.

Signed-off-by: Nick Terrell 
---
 include/linux/zstd_compat.h | 112 
 1 file changed, 112 deletions(-)
 delete mode 100644 include/linux/zstd_compat.h

diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h
deleted file mode 100644
index 11acf14d9d70..
--- a/include/linux/zstd_compat.h
+++ /dev/null
@@ -1,112 +0,0 @@
-/*
- * Copyright (c) 2016-present, Facebook, Inc.
- * All rights reserved.
- *
- * This source code is licensed under the BSD-style license found in the
- * LICENSE file in the root directory of https://github.com/facebook/zstd.
- * An additional grant of patent rights can be found in the PATENTS file in the
- * same directory.
- *
- * This program is free software; you can redistribute it and/or modify it under
- * the terms of the GNU General Public License version 2 as published by the
- * Free Software Foundation. This program is dual-licensed; you may select
- * either version 2 of the GNU General Public License ("GPL") or BSD license
- * ("BSD").
- */
-
-#ifndef ZSTD_COMPAT_H
-#define ZSTD_COMPAT_H
-
-#include 
-
-#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406)
-/*
- * This header provides backwards compatibility for the zstd-1.4.6 library
- * upgrade. This header allows us to upgrade the zstd library version without
- * modifying any callers. Then we will migrate callers from the compatibility
- * wrapper one at a time until none remain. At which point we will delete this
- * header.
- *
- * It is temporary and will be deleted once the upgrade is complete.
- */
-
-#include 
-
-static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params)
-{
-return ZSTD_estimateCCtxSize_usingCParams(compression_params);
-}
-
-static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params)
-{
-return ZSTD_estimateCStreamSize_usingCParams(compression_params);
-}
-
-static inline size_t ZSTD_DCtxWorkspaceBound(void)
-{
-return ZSTD_estimateDCtxSize();
-}
-
-static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size)
-{
-return ZSTD_estimateDStreamSize(window_size);
-}
-
-static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size)
-{
-if (wksp == NULL)
-return NULL;
-return ZSTD_initStaticCCtx(wksp, wksp_size);
-}
-
-static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, size_t pledged_src_size, void* wksp, size_t wksp_size)
-{
-ZSTD_CStream* cstream;
-size_t ret;
-
-if (wksp == NULL)
-return NULL;
-
-cstream = ZSTD_initStaticCStream(wksp, wksp_size);
-if (cstream == NULL)
-return NULL;
-
-ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size);
-if (ZSTD_isError(ret))
-return NULL;
-
-return cstream;
-}
-#define ZSTD_initCStream ZSTD_initCStream_compat
-
-static inline ZSTD_DCtx* ZSTD_initDCtx(void* wksp, size_t wksp_size)
-{
-if (wksp == NULL)
-return NULL;
-return ZSTD_initStaticDCtx(wksp, wksp_size);
-}
-
-static inline ZSTD_DStream* ZSTD_initDStream_compat(unsigned long long window_size, void* wksp, size_t wksp_size)
-{
-if (wksp == NULL)
-return NULL;
-(void)window_size;
-return ZSTD_initStaticDStream(wksp, wksp_size);
-}
-#define ZSTD_initDStream ZSTD_initDStream_compat
-
-typedef ZSTD_frameHeader ZSTD_frameParams;
-
-static inline size_t ZSTD_getFrameParams(ZSTD_frameParams* frame_params, const 
void* src, size_t src_size)
-{
-return ZSTD_getFrameHeader(frame_params, src, src_size);
-}
-
-static inline size_t ZSTD_compressCCtx_compat(ZSTD_CCtx* cctx, void* dst, 
size_t dst_capacity, const void* src, size_t src_size, ZSTD_parameters params)
-{
-return ZSTD_compress_advanced(cctx, dst, dst_capacity, src, src_size, 
NULL, 0, params);
-}
-#define ZSTD_compressCCtx ZSTD_compressCCtx_compat
-
-#endif /* ZSTD_VERSION_NUMBER >= 10406 */
-#endif /* ZSTD_COMPAT_H */
-- 
2.28.0



[PATCH 8/9] lib: unzstd: Switch to the zstd-1.4.6 API

2020-09-15 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 lib/decompress_unzstd.c | 40 ++--
 1 file changed, 14 insertions(+), 26 deletions(-)

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index a79f705f236d..d4685df0e120 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -73,7 +73,8 @@
 
 #include 
 #include 
-#include 
+#include 
+#include 
 
 /* 128MB is the maximum window size supported by zstd. */
 #define ZSTD_WINDOWSIZE_MAX(1 << ZSTD_WINDOWLOG_MAX)
@@ -120,9 +121,9 @@ static int INIT decompress_single(const u8 *in_buf, long 
in_len, u8 *out_buf,
  long out_len, long *in_pos,
  void (*error)(char *x))
 {
-   const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+   const size_t wksp_size = ZSTD_estimateDCtxSize();
void *wksp = large_malloc(wksp_size);
-   ZSTD_DCtx *dctx = ZSTD_initDCtx(wksp, wksp_size);
+   ZSTD_DCtx *dctx = ZSTD_initStaticDCtx(wksp, wksp_size);
int err;
size_t ret;
 
@@ -165,7 +166,6 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
 {
ZSTD_inBuffer in;
ZSTD_outBuffer out;
-   ZSTD_frameParams params;
void *in_allocated = NULL;
void *out_allocated = NULL;
void *wksp = NULL;
@@ -229,36 +229,24 @@ static int INIT __unzstd(unsigned char *in_buf, long 
in_len,
out.size = out_len;
 
/*
-* We need to know the window size to allocate the ZSTD_DStream.
-* Since we are streaming, we need to allocate a buffer for the sliding
-* window. The window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
-* (8 MB), so it is important to use the actual value so as not to
-* waste memory when it is smaller.
+* Zstd determines the workspace size from the window size written
+* into the frame header. This ensures that we use the minimum value
+* possible, since the window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
+* (1 GB), so it is very important to use the actual value.
 */
-   ret = ZSTD_getFrameParams(&params, in.src, in.size);
+   wksp_size = ZSTD_estimateDStreamSize_fromFrame(in.src, in.size);
err = handle_zstd_error(ret, error);
if (err)
goto out;
-   if (ret != 0) {
-   error("ZSTD-compressed data has an incomplete frame header");
-   err = -1;
-   goto out;
-   }
-   if (params.windowSize > ZSTD_WINDOWSIZE_MAX) {
-   error("ZSTD-compressed data has too large a window size");
+   wksp = large_malloc(wksp_size);
+   if (wksp == NULL) {
+   error("Out of memory while allocating ZSTD_DStream");
err = -1;
goto out;
}
-
-   /*
-* Allocate the ZSTD_DStream now that we know how much memory is
-* required.
-*/
-   wksp_size = ZSTD_DStreamWorkspaceBound(params.windowSize);
-   wksp = large_malloc(wksp_size);
-   dstream = ZSTD_initDStream(params.windowSize, wksp, wksp_size);
+   dstream = ZSTD_initStaticDStream(wksp, wksp_size);
if (dstream == NULL) {
-   error("Out of memory while allocating ZSTD_DStream");
+   error("ZSTD_initStaticDStream failed");
err = -1;
goto out;
}
-- 
2.28.0



[PATCH 2/9] lib: zstd: Add decompress_sources.h for decompress_unzstd

2020-09-15 Thread Nick Terrell
From: Nick Terrell 

Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.

This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.

Signed-off-by: Nick Terrell 
---
 lib/decompress_unzstd.c   |  6 +-
 lib/zstd/decompress_sources.h | 14 ++
 2 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 lib/zstd/decompress_sources.h

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index dbc290af26b4..a79f705f236d 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
 #ifdef STATIC
 # define UNZSTD_PREBOOT
 # include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
 #endif
 
 #include 
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index ..ccb4960ea0cd
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
-- 
2.28.0



[PATCH 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API

2020-09-15 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is more efficient because it uses the single-pass API instead of
the streaming API. The streaming API is not necessary because the whole
input and output buffers are available. This saves memory because we
don't need to allocate a buffer for the window. It is also more
efficient because it saves unnecessary memcpy calls.
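As a rough illustration of the single-pass call shape in the diff below: the destination budget is the cluster length minus one page and the header, and a "destination too small" result is mapped to -EAGAIN while every other failure becomes -EIO. This is only a sketch. DEMO_PAGE_SIZE, DEMO_HEADER_SIZE, the demo_status enum and both helpers are hypothetical stand-ins, not f2fs or zstd definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real values come from kernel and f2fs headers. */
#define DEMO_PAGE_SIZE   4096u
#define DEMO_HEADER_SIZE 24u

/* Output budget: the compressed cluster must save at least one page. */
static size_t demo_dst_budget(size_t cluster_len)
{
	assert(cluster_len > DEMO_PAGE_SIZE + DEMO_HEADER_SIZE);
	return cluster_len - DEMO_PAGE_SIZE - DEMO_HEADER_SIZE;
}

/* Hypothetical result of a single-pass compression call. */
enum demo_status { DEMO_OK, DEMO_DST_TOO_SMALL, DEMO_OTHER_ERROR };

/* Same mapping as the patch: too-small output is -EAGAIN, the rest -EIO. */
static int demo_map_status(enum demo_status st)
{
	switch (st) {
	case DEMO_OK:
		return 0;
	case DEMO_DST_TOO_SMALL:
		return -EAGAIN;
	default:
		return -EIO;
	}
}
```

With a four-page cluster the budget works out to 16384 - 4096 - 24 bytes under these assumed constants.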

I've had problems testing this code because I see data truncation before
and after this patchset. Help testing this patch would be much
appreciated.

Signed-off-by: Nick Terrell 
---
 fs/f2fs/compress.c | 102 +
 1 file changed, 38 insertions(+), 64 deletions(-)

diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index e056f3a2b404..b79efce81651 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -11,7 +11,8 @@
 #include 
 #include 
 #include 
-#include 
+#include 
+#include 
 
 #include "f2fs.h"
 #include "node.h"
@@ -298,21 +299,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = {
 static int zstd_init_compress_ctx(struct compress_ctx *cc)
 {
ZSTD_parameters params;
-   ZSTD_CStream *stream;
+   ZSTD_CCtx *ctx;
void *workspace;
unsigned int workspace_size;
 
params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0);
-   workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams);
+   workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams);
 
workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode),
workspace_size, GFP_NOFS);
if (!workspace)
return -ENOMEM;
 
-   stream = ZSTD_initCStream(params, 0, workspace, workspace_size);
-   if (!stream) {
-   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initCStream 
failed\n",
+   ctx = ZSTD_initStaticCCtx(workspace, workspace_size);
+   if (!ctx) {
+   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initStaticCCtx failed\n",
KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
__func__);
kvfree(workspace);
@@ -320,7 +321,7 @@ static int zstd_init_compress_ctx(struct compress_ctx *cc)
}
 
cc->private = workspace;
-   cc->private2 = stream;
+   cc->private2 = ctx;
 
cc->clen = cc->rlen - PAGE_SIZE - COMPRESS_HEADER_SIZE;
return 0;
@@ -335,65 +336,48 @@ static void zstd_destroy_compress_ctx(struct compress_ctx 
*cc)
 
 static int zstd_compress_pages(struct compress_ctx *cc)
 {
-   ZSTD_CStream *stream = cc->private2;
-   ZSTD_inBuffer inbuf;
-   ZSTD_outBuffer outbuf;
-   int src_size = cc->rlen;
-   int dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
-   int ret;
-
-   inbuf.pos = 0;
-   inbuf.src = cc->rbuf;
-   inbuf.size = src_size;
-
-   outbuf.pos = 0;
-   outbuf.dst = cc->cbuf->cdata;
-   outbuf.size = dst_size;
-
-   ret = ZSTD_compressStream(stream, &outbuf, &inbuf);
-   if (ZSTD_isError(ret)) {
-   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compressStream 
failed, ret: %d\n",
-   KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
-   __func__, ZSTD_getErrorCode(ret));
-   return -EIO;
-   }
-
-   ret = ZSTD_endStream(stream, &outbuf);
+   ZSTD_CCtx *ctx = cc->private2;
+   const size_t src_size = cc->rlen;
+   const size_t dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
+   ZSTD_parameters params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, src_size, 0);
+   size_t ret;
+
+   ret = ZSTD_compress_advanced(
+   ctx, cc->cbuf->cdata, dst_size, cc->rbuf, src_size, NULL, 0, params);
if (ZSTD_isError(ret)) {
-   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_endStream returned 
%d\n",
+   /*
+* there is compressed data remained in intermediate buffer due 
to
+* no more space in cbuf.cdata
+*/
+   if (ZSTD_getErrorCode(ret) == ZSTD_error_dstSize_tooSmall)
+   return -EAGAIN;
+   /* other compression errors return -EIO */
+   printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compress_advanced 
failed, err: %s\n",
KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
-   __func__, ZSTD_getErrorCode(ret));
+   __func__, ZSTD_getErrorName(ret));
return -EIO;
}
 
-   /*
-* there is compressed data remained in intermediate buffer due to
-* no more space in cbuf.cdata
-*/
-   if (ret)
-   return -EAGAIN;
-
-   cc->clen = outbuf.

[PATCH 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API

2020-09-15 Thread Nick Terrell
From: Nick Terrell 

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell 
---
 fs/btrfs/zstd.c | 48 
 1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index a7367ff573d4..6b466e090cd7 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include "misc.h"
 #include "compression.h"
 #include "ctree.h"
@@ -159,8 +159,8 @@ static void zstd_calc_ws_mem_sizes(void)
zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT);
size_t level_size =
max_t(size_t,
- ZSTD_CStreamWorkspaceBound(params.cParams),
- ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT));
+ 
ZSTD_estimateCStreamSize_usingCParams(params.cParams),
+ ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT));
 
max_size = max_t(size_t, max_size, level_size);
zstd_ws_mem_sizes[level - 1] = max_size;
@@ -389,13 +389,23 @@ int zstd_compress_pages(struct list_head *ws, struct 
address_space *mapping,
*total_in = 0;
 
/* Initialize the stream */
-   stream = ZSTD_initCStream(params, len, workspace->mem,
-   workspace->size);
+   stream = ZSTD_initStaticCStream(workspace->mem, workspace->size);
if (!stream) {
-   pr_warn("BTRFS: ZSTD_initCStream failed\n");
+   pr_warn("BTRFS: ZSTD_initStaticCStream failed\n");
ret = -EIO;
goto out;
}
+   {
+   size_t ret2;
+
+   ret2 = ZSTD_initCStream_advanced(stream, NULL, 0, params, len);
+   if (ZSTD_isError(ret2)) {
+   pr_warn("BTRFS: ZSTD_initCStream_advanced returned 
%s\n",
+   ZSTD_getErrorName(ret2));
+   ret = -EIO;
+   goto out;
+   }
+   }
 
/* map in the first page of input data */
in_page = find_get_page(mapping, start >> PAGE_SHIFT);
@@ -421,8 +431,8 @@ int zstd_compress_pages(struct list_head *ws, struct 
address_space *mapping,
ret2 = ZSTD_compressStream(stream, &workspace->out_buf,
&workspace->in_buf);
if (ZSTD_isError(ret2)) {
-   pr_debug("BTRFS: ZSTD_compressStream returned %d\n",
-   ZSTD_getErrorCode(ret2));
+   pr_debug("BTRFS: ZSTD_compressStream returned %s\n",
+   ZSTD_getErrorName(ret2));
ret = -EIO;
goto out;
}
@@ -489,8 +499,8 @@ int zstd_compress_pages(struct list_head *ws, struct 
address_space *mapping,
 
ret2 = ZSTD_endStream(stream, &workspace->out_buf);
if (ZSTD_isError(ret2)) {
-   pr_debug("BTRFS: ZSTD_endStream returned %d\n",
-   ZSTD_getErrorCode(ret2));
+   pr_debug("BTRFS: ZSTD_endStream returned %s\n",
+   ZSTD_getErrorName(ret2));
ret = -EIO;
goto out;
}
@@ -557,10 +567,9 @@ int zstd_decompress_bio(struct list_head *ws, struct 
compressed_bio *cb)
unsigned long buf_start;
unsigned long total_out = 0;
 
-   stream = ZSTD_initDStream(
-   ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
+   stream = ZSTD_initStaticDStream(workspace->mem, workspace->size);
if (!stream) {
-   pr_debug("BTRFS: ZSTD_initDStream failed\n");
+   pr_debug("BTRFS: ZSTD_initStaticDStream failed\n");
ret = -EIO;
goto done;
}
@@ -579,8 +588,8 @@ int zstd_decompress_bio(struct list_head *ws, struct 
compressed_bio *cb)
ret2 = ZSTD_decompressStream(stream, &workspace->out_buf,
&workspace->in_buf);
if (ZSTD_isError(ret2)) {
-   pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
-   ZSTD_getErrorCode(ret2));
+   pr_debug("BTRFS: ZSTD_decompressStream returned %s\n",
+   ZSTD_getErrorName(ret2));
ret = -EIO;
goto done;
}
@@ -633,10 +642,9 @@ int zstd_decompress(struct list_head *ws, unsigned char 
*data_in,
unsigned long pg_offset = 0;
char *kaddr;
 
-

[PATCH 1/9] lib: zstd: Add zstd compatibility wrapper

2020-09-15 Thread Nick Terrell
From: Nick Terrell 

Adds zstd_compat.h which provides the necessary functions from the
current zstd.h API. It is only active for zstd versions 1.4.6 and newer.
That means it is disabled currently, but will become active when a later
patch in this series updates the zstd library in the kernel to 1.4.6.
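For reference, a minimal sketch of how the 10406 in the guard corresponds to a release, assuming the upstream zstd.h encoding of ZSTD_VERSION_NUMBER as major*100*100 + minor*100 + release. The helper names are made up for illustration and are not part of zstd or the kernel.

```c
#include <assert.h>

/* Same arithmetic upstream zstd.h uses to build ZSTD_VERSION_NUMBER. */
static unsigned demo_version_number(unsigned major, unsigned minor,
				    unsigned release)
{
	/* Two decimal digits are reserved for minor and for release. */
	assert(minor < 100 && release < 100);
	return major * 100 * 100 + minor * 100 + release;
}

/* Runtime mirror of the wrapper's guard: ZSTD_VERSION_NUMBER >= 10406. */
static int demo_compat_wrapper_active(unsigned version)
{
	return version >= 10406;
}
```

So 1.4.6 encodes to 10406 and enables the wrapper body, while 1.3.1 encodes to 10301 and does not. The defined() half of the guard additionally keeps the wrapper inactive when the included header does not define the macro at all.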

This header allows the zstd upgrade to 1.4.6 without changing any
callers, since they all include zstd through the compatibility wrapper.
Later patches in this series transition each caller away from the
compatibility wrapper. After all the callers have been transitioned away
from the compatibility wrapper, the final patch in this series deletes
it.
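The wrapper technique itself can be sketched outside the kernel: an inline function keeps the old call shape, forwards to the new API, and a #define redirects the old name to it so callers compile unchanged. Everything named demo_* here is hypothetical; only the pattern (NULL check, ignored legacy argument, macro rename) follows zstd_compat.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "new" API: the context lives inside caller-provided memory. */
struct demo_ctx {
	size_t capacity;
};

static struct demo_ctx *demo_init_static(void *wksp, size_t wksp_size)
{
	struct demo_ctx *ctx = wksp;

	assert(wksp != NULL); /* in this sketch the new API requires a workspace */
	if (wksp_size < sizeof(*ctx))
		return NULL;
	ctx->capacity = wksp_size - sizeof(*ctx);
	return ctx;
}

/* Old call shape: extra leading argument, NULL workspace tolerated. */
static inline struct demo_ctx *demo_init_compat(unsigned long long window_size,
						void *wksp, size_t wksp_size)
{
	if (wksp == NULL)
		return NULL;
	(void)window_size; /* no longer needed by the new API */
	return demo_init_static(wksp, wksp_size);
}

/* Existing callers keep writing demo_init(...) and get the wrapper. */
#define demo_init demo_init_compat
```

Once every caller has been converted to call the new function directly, the wrapper and the #define can be deleted without touching the library.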

Signed-off-by: Nick Terrell 
---
 crypto/zstd.c   |   2 +-
 fs/btrfs/zstd.c |   2 +-
 fs/f2fs/compress.c  |   2 +-
 fs/squashfs/zstd_wrapper.c  |   2 +-
 include/linux/zstd_compat.h | 112 
 lib/decompress_unzstd.c |   2 +-
 6 files changed, 117 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/zstd_compat.h

diff --git a/crypto/zstd.c b/crypto/zstd.c
index 1a3309f066f7..dcda3cad3b5c 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 
diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index 9a4871636c6c..a7367ff573d4 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include "misc.h"
 #include "compression.h"
 #include "ctree.h"
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 1dfb126a0cb2..e056f3a2b404 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 
 #include "f2fs.h"
 #include "node.h"
diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
index b7cb1faa652d..f8c512a6204e 100644
--- a/fs/squashfs/zstd_wrapper.c
+++ b/fs/squashfs/zstd_wrapper.c
@@ -11,7 +11,7 @@
 #include 
 #include 
 #include 
-#include 
+#include 
 #include 
 
 #include "squashfs_fs.h"
diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h
new file mode 100644
index ..11acf14d9d70
--- /dev/null
+++ b/include/linux/zstd_compat.h
@@ -0,0 +1,112 @@
+/*
+ * Copyright (c) 2016-present, Facebook, Inc.
+ * All rights reserved.
+ *
+ * This source code is licensed under the BSD-style license found in the
+ * LICENSE file in the root directory of https://github.com/facebook/zstd.
+ * An additional grant of patent rights can be found in the PATENTS file in the
+ * same directory.
+ *
+ * This program is free software; you can redistribute it and/or modify it 
under
+ * the terms of the GNU General Public License version 2 as published by the
+ * Free Software Foundation. This program is dual-licensed; you may select
+ * either version 2 of the GNU General Public License ("GPL") or BSD license
+ * ("BSD").
+ */
+
+#ifndef ZSTD_COMPAT_H
+#define ZSTD_COMPAT_H
+
+#include 
+
+#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406)
+/*
+ * This header provides backwards compatibility for the zstd-1.4.6 library
+ * upgrade. This header allows us to upgrade the zstd library version without
+ * modifying any callers. Then we will migrate callers from the compatibility
+ * wrapper one at a time until none remain. At which point we will delete this
+ * header.
+ *
+ * It is temporary and will be deleted once the upgrade is complete.
+ */
+
+#include 
+
+static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters 
compression_params)
+{
+return ZSTD_estimateCCtxSize_usingCParams(compression_params);
+}
+
+static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters 
compression_params)
+{
+return ZSTD_estimateCStreamSize_usingCParams(compression_params);
+}
+
+static inline size_t ZSTD_DCtxWorkspaceBound(void)
+{
+return ZSTD_estimateDCtxSize();
+}
+
+static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size)
+{
+return ZSTD_estimateDStreamSize(window_size);
+}
+
+static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size)
+{
+if (wksp == NULL)
+return NULL;
+return ZSTD_initStaticCCtx(wksp, wksp_size);
+}
+
+static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, 
size_t pledged_src_size, void* wksp, size_t wksp_size)
+{
+ZSTD_CStream* cstream;
+size_t ret;
+
+if (wksp == NULL)
+return NULL;
+
+cstream = ZSTD_initStaticCStream(wksp, wksp_size);
+if (cstream == NULL)
+return NULL;
+
+ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, 
pledged_src_size);
+if (ZSTD_isError(ret))
+return NULL;
+
+return cstream;
+}
+#define ZSTD_initCStream ZSTD_initCStream_compat
+
+static inline ZSTD_DCtx* ZSTD_initDCtx(void* wksp, size_t wksp_size)
+{
+if (wksp == NULL)
+retu

[PATCH 0/9] Update to zstd-1.4.6

2020-09-15 Thread Nick Terrell
From: Nick Terrell 

This patchset upgrades the zstd library to the latest upstream release. The
current zstd version in the kernel is a modified version of upstream zstd-1.3.1.
At the time it was integrated, zstd wasn't ready to be used in the kernel as-is.
But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.6 upstream. I want the zstd version in the
kernel to match up with a known upstream release, so we know exactly what code
is running. Whenever this patchset is ready for merge, I will cut a release at
the upstream commit that gets merged. This should not be necessary for future
releases.

The kernel zstd library is automatically generated from upstream zstd. A script
makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested every commit by upstream's continuous integration.
When we cut a new zstd release, we will submit a patch to the kernel to update
the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must
build, so that precludes partial updates. Since the commit is 100% generated, I
hope the review burden is lightened. I considered replaying upstream commits,
but that is not possible because there have been ~3500 upstream commits since
the last zstd import, and the commits don't all build individually. The bulk
update preserves bisectability because bugs can be bisected to the zstd version
update. At that point the update can be reverted, and we can work with upstream
to find and fix the bug. After this big switch in how the kernel consumes zstd,
future patches will be smaller, because they will only have one upstream
release's worth of changes each.

This patchset comes in 3 parts:
1. The first 2 patches prepare for the zstd upgrade. The first patch adds a
   compatibility wrapper so zstd can be upgraded without modifying any callers.
   The second patch adds an indirection for how lib/decompress_unzstd.c
   includes all decompression source files.
2. Import zstd-1.4.6. This patch is completely generated from upstream using
   automated tooling.
3. Update all callers to the zstd-1.4.6 API then delete the compatibility
   wrapper.

I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade
using the compatibility wrapper, and after the final patch in this series. I had
problems with F2FS, where I had file truncation both before and after this
series, so I would appreciate help testing it. All other callers were good.

I tested kernel and initramfs decompression in i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
I found:
* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during
compression.

Please let me know if there is anything that I can do to ease the way for these
patches. I think it is important because it gets large performance improvements,
contains bug fixes, and is switching to a more maintainable model of consuming
upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

Nick Terrell (9):
  lib: zstd: Add zstd compatibility wrapper
  lib: zstd: Add decompress_sources.h for decompress_unzstd
  lib: zstd: Upgrade to latest upstream zstd version 1.4.6
  crypto: zstd: Switch to zstd-1.4.6 API
  btrfs: zstd: Switch to the zstd-1.4.6 API
  f2fs: zstd: Switch to the zstd-1.4.6 API
  squashfs: zstd: Switch to the zstd-1.4.6 API
  lib: unzstd: Switch to the zstd-1.4.6 API
  lib: zstd: Remove zstd compatibility wrapper

 crypto/zstd.c |   22 +-
 fs/btrfs/zstd.c   |   46 +-
 fs/f2fs/compress.c|  100 +-
 fs/squashfs/zstd_wrapper.c|7 +-
 include/linux/zstd.h  | 3019 
 include/linux/zstd_errors.h   |   76 +
 lib/decompress_unzstd.c   |   44 +-
 lib/zstd/Makefile |   35 +-
 lib/zstd/bitstream.h  |  379 --
 lib/zstd/common/bitstream.h   |  437 ++
 lib/zstd/common/compiler.h|  134 +
 lib/zstd/common/cpu.h |  194 +
 lib/zstd/common/debug.c   |   24 +
 lib/zstd/common/debug.h   |  101 +
 lib/zstd/common/entropy_common.c  |  355 ++
 lib/zstd/common

Re: [PATCH] zstd: Fix decompression of large window archives on 32-bit platforms

2020-09-14 Thread Nick Terrell
On Sun, Sep 13, 2020 at 11:19 PM Petr Malat  wrote:
>
> It seems some optimization has been removed from the code without removing
> the if condition which should activate it only on 64-bit platforms and as
> a result the code responsible for decompression with window larger than
> 8MB was disabled on 32-bit platforms.
>
> Signed-off-by: Petr Malat 

Reviewed-by: Nick Terrell 

Thanks for the fix! I looked upstream and this fix corresponds to this
upstream commit:
https://github.com/facebook/zstd/commit/8a5c0c98ae5a7884694589d7a69bc99011add94d
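To spell out what the quoted diff changes: 1 << 23 is 8 MiB, and before the fix the long-window path was additionally gated on sizeof(size_t) > 4. A small model of both conditions follows. The function names are hypothetical, and size_t's width is passed in as a parameter only so that both platform cases can be shown in one program.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_LONG_WINDOW_THRESHOLD (1u << 23) /* 8 MiB, as in the diff */

/* Before the fix: the long path was only reachable on 64-bit builds. */
static int demo_long_path_old(size_t size_t_bytes,
			      unsigned long long window_size)
{
	return size_t_bytes > 4 && window_size > DEMO_LONG_WINDOW_THRESHOLD;
}

/* After the fix: the window size alone decides. */
static int demo_long_path_fixed(unsigned long long window_size)
{
	return window_size > DEMO_LONG_WINDOW_THRESHOLD;
}

/* Usage: a 128 MiB window on a 32-bit build changes behaviour. */
void demo_long_path_selftest(void)
{
	const unsigned long long window = 1ull << 27; /* 128 MiB */

	assert(!demo_long_path_old(4, window));  /* 32-bit: path skipped */
	assert(demo_long_path_old(8, window));   /* 64-bit: path taken */
	assert(demo_long_path_fixed(window));    /* fixed: always taken */
	assert(!demo_long_path_fixed(1ull << 23)); /* exactly 8 MiB: short path */
}
```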

Thanks,
Nick Terrell

> ---
>  lib/zstd/decompress.c | 8 ++--
>  1 file changed, 2 insertions(+), 6 deletions(-)
>
> diff --git a/lib/zstd/decompress.c b/lib/zstd/decompress.c
> index db6761ea4deb..509a3b8d51b9 100644
> --- a/lib/zstd/decompress.c
> +++ b/lib/zstd/decompress.c
> @@ -1457,12 +1457,8 @@ static size_t ZSTD_decompressBlock_internal(ZSTD_DCtx 
> *dctx, void *dst, size_t d
> ip += litCSize;
> srcSize -= litCSize;
> }
> -   if (sizeof(size_t) > 4) /* do not enable prefetching on 32-bits x86, 
> as it's performance detrimental */
> -   /* likely because of register pressure */
> -   /* if that's the correct cause, then 32-bits 
> ARM should be affected differently */
> -   /* it would be good to test this on ARM real 
> hardware, to see if prefetch version improves speed */
> -   if (dctx->fParams.windowSize > (1 << 23))
> -   return ZSTD_decompressSequencesLong(dctx, dst, 
> dstCapacity, ip, srcSize);
> +   if (dctx->fParams.windowSize > (1 << 23))
> +   return ZSTD_decompressSequencesLong(dctx, dst, dstCapacity, 
> ip, srcSize);
> return ZSTD_decompressSequences(dctx, dst, dstCapacity, ip, srcSize);
>  }
>
> --
> 2.20.1
>


Re: [PATCH v3 1/2] lib: decompress_unzstd: Limit output size

2020-09-01 Thread Nick Terrell


> On Sep 1, 2020, at 7:26 AM, Paul Cercueil  wrote:
> 
> The zstd decompression code, as it is right now, will most likely fail
> on 32-bit systems, as the default output buffer size causes the buffer's
> end address to overflow.
> 
> Address this issue by setting a sane default to the default output size,
> with a value that won't overflow the buffer's end address.
> 
> Signed-off-by: Paul Cercueil 
> ---
> 
> Notes:
>v2: Change limit to 1 GiB
> 
>v3: Compute size limit instead of using hardcoded value
> 
> lib/decompress_unzstd.c | 7 ++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
> index 0ad2c15479ed..790abc472f5b 100644
> --- a/lib/decompress_unzstd.c
> +++ b/lib/decompress_unzstd.c
> @@ -178,8 +178,13 @@ static int INIT __unzstd(unsigned char *in_buf, long 
> in_len,
>   int err;
>   size_t ret;
> 
> + /*
> +  * ZSTD decompression code won't be happy if the buffer size is so big
> +  * that its end address overflows. When the size is not provided, make
> +  * it as big as possible without having the end address overflow.
> +  */
>   if (out_len == 0)
> - out_len = LONG_MAX; /* no limit */
> +     out_len = UINTPTR_MAX - (uintptr_t)out_buf;

Great, that works for me. Thanks for fixing this!

Reviewed-by: Nick Terrell 

>   if (fill == NULL && flush == NULL)
>   /*
> -- 
> 2.28.0
> 



Re: [PATCH v2 1/2] lib: decompress_unzstd: Limit output size

2020-08-25 Thread Nick Terrell

> On Aug 25, 2020, at 2:01 PM, Paul Cercueil  wrote:
> 
> The zstd decompression code, as it is right now, will have internal
> values overflow on 32-bit systems when the output size is bigger than
> 1 GiB.
> 
> Until someone smarter than me can figure out how to fix the zstd code
> properly, limit the destination buffer size to 1 GiB, which should be
> enough for everybody, in order to make it usable on 32-bit systems.

I was talking with Yann Collet, and we believe that it isn’t the long that
is overflowing, but the pointers. Zstd expects to be given a valid output
size. It generally uses a begin/end pointer pair with its output buffer. So
when it is given a very large output size in 32-bit mode, the end pointer
overflows, either causing UB or leaving end pointer < begin pointer,
which breaks zstd.

Zstd will probably never be able to work properly in this way. A better
solution might be to pass MAX_ADDRESS_PTR - OUTPUT_PTR as
the size to the __decompress() call. Or some other size that won’t
overflow the pointer.
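The wrap-around can be modelled with plain 32-bit integers. This is a sketch only: the buffer address 0x81000000 is a hypothetical value picked because it lies in the upper half of a 32-bit address space, not an address reported in this thread, and the helper names are invented.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if begin + len wraps past the top of a 32-bit address space. */
static int demo_end_wraps32(uint32_t begin, uint32_t len)
{
	return (uint32_t)(begin + len) < begin;
}

/* Largest length whose end address still fits: MAX_ADDRESS - begin. */
static uint32_t demo_max_len32(uint32_t begin)
{
	return UINT32_MAX - begin;
}

void demo_wrap_selftest(void)
{
	const uint32_t out_buf = 0x81000000u; /* hypothetical load address */

	assert(demo_end_wraps32(out_buf, 0x7fffffffu));  /* LONG_MAX on 32-bit */
	assert(demo_end_wraps32(out_buf, 0x80000000u));  /* 2 GiB */
	assert(!demo_end_wraps32(out_buf, 0x40000000u)); /* 1 GiB */
	assert(!demo_end_wraps32(out_buf, demo_max_len32(out_buf)));
}
```

Under that assumed address, LONG_MAX and 2 GiB both wrap while 1 GiB does not, and the computed maximum lands the end pointer exactly at the top of the address space.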

Best,
Nick

> Signed-off-by: Paul Cercueil 
> Reviewed-by: Nick Terrell 
> ---
> 
> Notes:
>v2: Change limit to 1 GiB
> 
> lib/decompress_unzstd.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
> index 0ad2c15479ed..414517baedb0 100644
> --- a/lib/decompress_unzstd.c
> +++ b/lib/decompress_unzstd.c
> @@ -77,6 +77,7 @@
> 
> #include 
> #include 
> +#include 
> #include 
> 
> /* 128MB is the maximum window size supported by zstd. */
> @@ -179,7 +180,7 @@ static int INIT __unzstd(unsigned char *in_buf, long 
> in_len,
>   size_t ret;
> 
>   if (out_len == 0)
> - out_len = LONG_MAX; /* no limit */
> + out_len = SZ_1G; /* should be big enough, right? */
> 
>   if (fill == NULL && flush == NULL)
>   /*
> -- 
> 2.28.0
> 



Re: [PATCH 1/2] lib: decompress_unzstd: Limit output size

2020-08-24 Thread Nick Terrell


> On Aug 24, 2020, at 2:05 PM, Paul Cercueil  wrote:
> 
> 
> 
> Le lun. 24 août 2020 à 20:11, Nick Terrell  a écrit :
>>> On Aug 21, 2020, at 9:29 AM, Paul Cercueil  wrote:
>>> The zstd decompression code, as it is right now, will have internal
>>> values overflow on 32-bit systems when the output size is LONG_MAX.
>>> Until someone smarter than me can figure out how to fix the zstd code
>>> properly, limit the destination buffer size to 512 MiB, which should be
>>> enough for everybody, in order to make it usable on 32-bit systems.
>> Can you bump the size up to 2GB? I suspect the problem inside of zstd
>> is an off-by-one error or something similar, so getting closer to the limit
>> shouldn't be a problem. I’d feel more comfortable with 2GB, since
>> kernels can get pretty large.
> 
> SZ_1G is the biggest I can go to get the kernel to boot. With SZ_2G it won't 
> boot.

Strange… I don’t quite know what is going on then. Thanks for the fix! You can add:

Reviewed-By: Nick Terrell 

Best,
Nick

>> Hmm, zstd shouldn’t be overflowing that value. I’m currently preparing
>> a patch to updating the version of zstd in the kernel, and using upstream
>> directly. I will add a test upstream in 32-bit mode to ensure that we don’t
>> overflow a 32-bit size_t, so this will be fixed after the update.
> 
> Great, thanks.
> 
> Cheers,
> -Paul
> 
>> -Nick
>>> Signed-off-by: Paul Cercueil 
>>> ---
>>> lib/decompress_unzstd.c | 3 ++-
>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>> diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
>>> index 0ad2c15479ed..e1c03b1eaa6e 100644
>>> --- a/lib/decompress_unzstd.c
>>> +++ b/lib/decompress_unzstd.c
>>> @@ -77,6 +77,7 @@
>>> #include 
>>> #include 
>>> +#include 
>>> #include 
>>> /* 128MB is the maximum window size supported by zstd. */
>>> @@ -179,7 +180,7 @@ static int INIT __unzstd(unsigned char *in_buf, long 
>>> in_len,
>>> size_t ret;
>>> if (out_len == 0)
>>> -   out_len = LONG_MAX; /* no limit */
>>> +   out_len = SZ_512M; /* should be big enough, right? */
>>> if (fill == NULL && flush == NULL)
>>> /*
>>> --
>>> 2.28.0
> 
> 



Re: [PATCH 2/2] MIPS: Add support for ZSTD-compressed kernels

2020-08-24 Thread Nick Terrell


> On Aug 24, 2020, at 2:02 PM, Paul Cercueil  wrote:
> 
> Hi Nick,
> 
> Le lun. 24 août 2020 à 19:51, Nick Terrell  a écrit :
>>> On Aug 21, 2020, at 9:29 AM, Paul Cercueil  wrote:
>>> Add support for self-extracting kernels with a ZSTD compression.
>>> Tested on a kernel for the GCW-Zero, it allows to reduce the size of the
>>> kernel file from 4.1 MiB with gzip to 3.5 MiB with ZSTD, and boots just
>>> as fast.
>>> Signed-off-by: Paul Cercueil 
>>> ---
>>> arch/mips/Kconfig  |  1 +
>>> arch/mips/boot/compressed/Makefile |  1 +
>>> arch/mips/boot/compressed/decompress.c |  4 
>>> arch/mips/boot/compressed/string.c | 16 
>>> 4 files changed, 22 insertions(+)
>>> diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
>>> index c95fa3a2484c..b9d7c4249dc9 100644
>>> --- a/arch/mips/Kconfig
>>> +++ b/arch/mips/Kconfig
>>> @@ -1890,6 +1890,7 @@ config SYS_SUPPORTS_ZBOOT
>>> select HAVE_KERNEL_LZMA
>>> select HAVE_KERNEL_LZO
>>> select HAVE_KERNEL_XZ
>>> +   select HAVE_KERNEL_ZSTD
>>> config SYS_SUPPORTS_ZBOOT_UART16550
>>> bool
>>> diff --git a/arch/mips/boot/compressed/Makefile 
>>> b/arch/mips/boot/compressed/Makefile
>>> index 6e56caef69f0..86ddc6fc16f4 100644
>>> --- a/arch/mips/boot/compressed/Makefile
>>> +++ b/arch/mips/boot/compressed/Makefile
>>> @@ -70,6 +70,7 @@ tool_$(CONFIG_KERNEL_LZ4) = lz4
>>> tool_$(CONFIG_KERNEL_LZMA)= lzma
>>> tool_$(CONFIG_KERNEL_LZO) = lzo
>>> tool_$(CONFIG_KERNEL_XZ)  = xzkern
>>> +tool_$(CONFIG_KERNEL_ZSTD)= zstd
>> You can use zstd22 here. It will give you slightly better compression
>> without any additional memory usage. Also, you should add
>> -D__DISABLE_EXPORTS to the KBUILD_CFLAGS like x86 does [1].
> 
> Indeed, it's 0.01% smaller :)
> 
> What is __DISABLE_EXPORTS for?

It disables the EXPORT_SYMBOL() macros inside of lib/zstd/decompress.c.
On x86 the kernel won’t boot with these defined. Other decompressors hide
them if the STATIC macro is defined, but zstd uses this method, which was
added somewhat recently.

-Nick

> -Paul
> 
>> [1] 
>> https://github.com/torvalds/linux/blob/master/arch/x86/boot/compressed/Makefile
>> -Nick
>>> targets += vmlinux.bin.z
>>> $(obj)/vmlinux.bin.z: $(obj)/vmlinux.bin FORCE
>>> diff --git a/arch/mips/boot/compressed/decompress.c 
>>> b/arch/mips/boot/compressed/decompress.c
>>> index 88f5d637b1c4..c61c641674e6 100644
>>> --- a/arch/mips/boot/compressed/decompress.c
>>> +++ b/arch/mips/boot/compressed/decompress.c
>>> @@ -72,6 +72,10 @@ void error(char *x)
>>> #include "../../../../lib/decompress_unxz.c"
>>> #endif
>>> +#ifdef CONFIG_KERNEL_ZSTD
>>> +#include "../../../../lib/decompress_unzstd.c"
>>> +#endif
>>> +
>>> const unsigned long __stack_chk_guard = 0x000a0dff;
>>> void __stack_chk_fail(void)
>>> diff --git a/arch/mips/boot/compressed/string.c 
>>> b/arch/mips/boot/compressed/string.c
>>> index 43beecc3587c..ab95722ec0c9 100644
>>> --- a/arch/mips/boot/compressed/string.c
>>> +++ b/arch/mips/boot/compressed/string.c
>>> @@ -27,3 +27,19 @@ void *memset(void *s, int c, size_t n)
>>> ss[i] = c;
>>> return s;
>>> }
>>> +
>>> +void *memmove(void *dest, const void *src, size_t n)
>>> +{
>>> +   unsigned int i;
>>> +   const char *s = src;
>>> +   char *d = dest;
>>> +
>>> +   if ((uintptr_t)dest < (uintptr_t)src) {
>>> +   for (i = 0; i < n; i++)
>>> +   d[i] = s[i];
>>> +   } else {
>>> +   for (i = n; i > 0; i--)
>>> +   d[i - 1] = s[i - 1];
>>> +   }
>>> +   return dest;
>>> +}
>>> --
>>> 2.28.0


