Re: [GIT PULL][PATCH v9 0/3] Update to zstd-1.4.10
On Wed, Apr 14, 2021 at 12:04 PM Eric Biggers wrote:
>
> On Wed, Apr 14, 2021 at 11:53:51AM -0700, Nick Terrell wrote:
> > On Wed, Apr 14, 2021 at 11:35 AM Eric Biggers wrote:
> > >
> > > On Wed, Apr 14, 2021 at 11:01:29AM -0700, Nick Terrell wrote:
> > > > Hi all,
> > > >
> > > > I would really like to make some progress on this and get it merged.
> > > > This patchset offers:
> > > > * 15-30% better decompression speed
> > > > * 3 years of zstd bug fixes and code improvements
> > > > * Allows us to import zstd directly from upstream so we don't fall 3
> > > >   years out of date again
> > > >
> > > > Thanks,
> > > > Nick
> > > >
> > >
> > > I think it would help get it merged if someone actually volunteered to
> > > maintain it. As-is there is no entry in MAINTAINERS for this code.
> >
> > I was discussing with Chris Mason about volunteering to maintain the
> > code myself. We wanted to wait until this series got merged before going
> > that route, because there were already a lot of comments about it, and I
> > didn't want to appear to be trying to bypass any review or criticisms.
> > But, please let me know what you think.
> >
>
> I expect that most people would like to see a commitment to maintain this code
> before merging. The usual way to do that is to add a MAINTAINERS entry.
>
> Otherwise it is 27000 lines of code dumped on other people to maintain.

I will add a 4th patch in the series to update the MAINTAINERS.

> - Eric
Re: [GIT PULL][PATCH v9 0/3] Update to zstd-1.4.10
On Wed, Apr 14, 2021 at 11:35 AM Eric Biggers wrote:
>
> On Wed, Apr 14, 2021 at 11:01:29AM -0700, Nick Terrell wrote:
> > Hi all,
> >
> > I would really like to make some progress on this and get it merged.
> > This patchset offers:
> > * 15-30% better decompression speed
> > * 3 years of zstd bug fixes and code improvements
> > * Allows us to import zstd directly from upstream so we don't fall 3
> >   years out of date again
> >
> > Thanks,
> > Nick
> >
>
> I think it would help get it merged if someone actually volunteered to
> maintain it. As-is there is no entry in MAINTAINERS for this code.

I was discussing with Chris Mason about volunteering to maintain the code
myself. We wanted to wait until this series got merged before going that
route, because there were already a lot of comments about it, and I didn't
want to appear to be trying to bypass any review or criticisms. But, please
let me know what you think.

Best,
Nick

> - Eric
Re: [GIT PULL][PATCH v9 0/3] Update to zstd-1.4.10
Hi all,

I would really like to make some progress on this and get it merged.
This patchset offers:
* 15-30% better decompression speed
* 3 years of zstd bug fixes and code improvements
* Allows us to import zstd directly from upstream so we don't fall 3
  years out of date again

Thanks,
Nick

On Fri, Apr 9, 2021 at 2:39 PM Nick Terrell wrote:
>
> What can I do to help get this merged?
>
> Cristoph, is this new patch series with the kernel wrapper API satisfactory?
>
> Best,
> Nick
>
> On Tue, Mar 30, 2021 at 3:45 PM Nick Terrell wrote:
> > [...]
Re: [PATCH -next] lib: zstd: Make symbol 'HUF_compressWeights_wksp' static
> On Apr 8, 2021, at 8:09 PM, Miguel Ojeda wrote:
>
> On Fri, Apr 9, 2021 at 2:20 AM Nick Desaulniers wrote:
>>
>> Quite a few other functions are declared in a header, but I don't see
>> any existing callers in tree. I wonder if the maintainer could
>> consider cleaning these up so that we don't retain them in binaries
>> without dead code elimination enabled, or if there's a need to keep
>> this code in line with an external upstream codebase?
>
> Yeah, the equivalent cleanup was done upstream by Nick in 2018 [1],
> but there has been no major update to lib/zstd since 2017.
>
> Thus a cleanup would actually make it closer to upstream, which is the
> best case scenario :)
>
> Reviewed-by: Miguel Ojeda
>
> [1] https://github.com/facebook/zstd/commit/f2d6db45cd28457fa08467416e8535985f062859

This looks good to me as well. I have a patchset up to use upstream zstd
directly in the kernel [0]. That will allow us to keep zstd up to date. And
after that lands, I hope to set up a zstd linux tree to make merging patches
into lib/zstd easier, since over the years quite a few have been ignored.

[0] https://www.mail-archive.com/linux-kernel@vger.kernel.org/msg2532407.html

Best,
Nick Terrell

> Cheers,
> Miguel
Re: [GIT PULL][PATCH v9 0/3] Update to zstd-1.4.10
What can I do to help get this merged?

Cristoph, is this new patch series with the kernel wrapper API satisfactory?

Best,
Nick

On Tue, Mar 30, 2021 at 3:45 PM Nick Terrell wrote:
> [...]
Re: [PATCH] init: add support for zstd compressed modules
> On Apr 7, 2021, at 6:53 AM, Masahiro Yamada wrote:
>
> On Thu, Apr 1, 2021 at 4:21 AM Nick Terrell wrote:
>>
>>> On Mar 31, 2021, at 10:48 AM, Oleksandr Natalenko wrote:
>>>
>>> [...]
>>>
>>> Whatever default non-standard compression level you choose, I'm fine
>>> as long as I can change it without editing Makefile.
>>
>> That makes sense to me. I have a deep seated need to compress files as
>> efficiently as possible for widely distributed packages. But, I understand
>> that slow compression significantly impacts build times for quick iteration.
>> I’d be happy with a compression level parameter that defaults to a happy
>> middle.
>>
>> I’m also fine with taking this patch as-is if it is easier, and I can put up
>> another patch that adds a compression level parameter, since I don’t want to
>> block merging this.
>
>
> I do not want to take such a patch.
> Meeting everyone's requirement
> results in a bad project for everyone.
>
>
> Does this work for you?
>
> make modules_install ZSTD="zstd -19"

Yeah, that’s perfect. Do y
Re: [PATCH] init: add support for zstd compressed modules
> On Apr 1, 2021, at 12:54 AM, torv...@mailbox.org wrote:
>
> Thanks Piotr, good work!
> Question: Is `-T0` really faster in this particular case than the default
> `-T1`? Are modules installed sequentially?

The zstd CLI produces deterministic output regardless of the number of threads
used. `-T1` (or not specifying `-T`) will produce the same output as `-T0`.
`-T0` will be faster for large files (at the default level, multiple jobs will
be spawned for files > 8MB), and be just as fast as `-T1` for smaller files.

Best,
Nick

> I also saw that Masahiro did some work on modules_install, moving
> MODULE_COMPRESS from the base Makefile to scripts/Makefile.modinst, so
> perhaps this should also be moved there at a later point.
>
> Tor Vic
Re: [PATCH] init: add support for zstd compressed modules
> On Mar 31, 2021, at 10:48 AM, Oleksandr Natalenko wrote:
>
> Hello.
>
> On Wed, Mar 31, 2021 at 05:39:25PM +, Nick Terrell wrote:
>>
>>
>>> On Mar 30, 2021, at 4:50 AM, Oleksandr Natalenko wrote:
>>>
>>> On Tue, Mar 30, 2021 at 01:32:35PM +0200, Piotr Gorski wrote:
>>>> kmod 28 supports modules compressed in zstd format so let's add this
>>>> possibility to kernel.
>>>>
>>>> [...]
>>>>
>>>> + ifdef CONFIG_MODULE_COMPRESS_ZSTD
>>>> +mod_compress_cmd = $(ZSTD) -T0 --rm -f -q
>>
>> This will use the default zstd level, level 3. I think it would make more
>> sense to use a high compression level. Level 19 would probably be a good
>> choice. That will choose a window size of up to 8MB, meaning the
>> decompressor needs to allocate that much memory. If that is unacceptable,
>> you could use `zstd -T0 --rm -f -q -19 --zstd=wlog=21`, which will use a
>> window size of up to 2MB, to match the XZ command. Note that if the file is
>> smaller than the window size, it will be shrunk to the smallest power of two
>> at least as large as the file.
>
> Please no. We've already done that with initramfs in Arch, and it
> increased the time to generate it enormously.
>
> I understand that building a kernel is a more rare operation than
> regenerating initramfs, but still I'd go against hard-coding the level.
> And if it should be specified anyway, I'd opt in for an explicit
> configuration option. Remember, not all the kernel are built on
> build farms...
>
> FWIW, Piotr originally used level 9 which worked okay, but I insisted
> on sending the patch initially without specifying level at all like it is
> done for other compressors. If this is a wrong approach, then oh meh,
> mea culpa ;).
>
> Whatever default non-standard compression level you choose, I'm fine
> as long as I can change it without editing Makefile.

That makes sense to me. I have a deep seated need to compress files as
efficiently as possible for widely distributed packages. But, I understand that
slow compression significantly impacts build times for quick iteration. I’d be
happy with a compression level parameter that defaults to a happy middle.

I’m also fine with taking this patch as-is if it is easier, and I can put up
another patch that adds a compression level parameter, since I don’t want to
block merging this.

Best,
Nick Terrell

> Thanks!
>
>>
>> Best,
>> Nick Terrell
>>
>>>> [...]
Re: [PATCH] init: add support for zstd compressed modules
> On Mar 30, 2021, at 4:50 AM, Oleksandr Natalenko wrote:
>
> On Tue, Mar 30, 2021 at 01:32:35PM +0200, Piotr Gorski wrote:
>> kmod 28 supports modules compressed in zstd format so let's add this
>> possibility to kernel.
>>
>> Signed-off-by: Piotr Gorski
>> ---
>> Makefile | 7 +--
>> init/Kconfig | 9 ++---
>> 2 files changed, 11 insertions(+), 5 deletions(-)
>>
>> diff --git a/Makefile b/Makefile
>> index 5160ff8903c1..82f4f4cc2955 100644
>> --- a/Makefile
>> +++ b/Makefile
>> @@ -1156,8 +1156,8 @@ endif # INSTALL_MOD_STRIP
>> export mod_strip_cmd
>>
>> # CONFIG_MODULE_COMPRESS, if defined, will cause module to be compressed
>> -# after they are installed in agreement with CONFIG_MODULE_COMPRESS_GZIP
>> -# or CONFIG_MODULE_COMPRESS_XZ.
>> +# after they are installed in agreement with CONFIG_MODULE_COMPRESS_GZIP,
>> +# CONFIG_MODULE_COMPRESS_XZ, or CONFIG_MODULE_COMPRESS_ZSTD.
>>
>> mod_compress_cmd = true
>> ifdef CONFIG_MODULE_COMPRESS
>> @@ -1167,6 +1167,9 @@ ifdef CONFIG_MODULE_COMPRESS
>> ifdef CONFIG_MODULE_COMPRESS_XZ
>> mod_compress_cmd = $(XZ) --lzma2=dict=2MiB -f
>> endif # CONFIG_MODULE_COMPRESS_XZ
>> + ifdef CONFIG_MODULE_COMPRESS_ZSTD
>> +mod_compress_cmd = $(ZSTD) -T0 --rm -f -q

This will use the default zstd level, level 3. I think it would make more sense
to use a high compression level. Level 19 would probably be a good choice. That
will choose a window size of up to 8MB, meaning the decompressor needs to
allocate that much memory. If that is unacceptable, you could use
`zstd -T0 --rm -f -q -19 --zstd=wlog=21`, which will use a window size of up to
2MB, to match the XZ command. Note that if the file is smaller than the window
size, it will be shrunk to the smallest power of two at least as large as the
file.

Best,
Nick Terrell

>> + endif # CONFIG_MODULE_COMPRESS_ZSTD
>> endif # CONFIG_MODULE_COMPRESS
>> export mod_compress_cmd
>>
>> diff --git a/init/Kconfig b/init/Kconfig
>> index 8c2cfd88f6ef..86a452bc2747 100644
>> --- a/init/Kconfig
>> +++ b/init/Kconfig
>> @@ -2250,8 +2250,8 @@ config MODULE_COMPRESS
>> bool "Compress modules on installation"
>> help
>>
>> - Compresses kernel modules when 'make modules_install' is run; gzip or
>> - xz depending on "Compression algorithm" below.
>> + Compresses kernel modules when 'make modules_install' is run; gzip,
>> + xz, or zstd depending on "Compression algorithm" below.
>>
>> module-init-tools MAY support gzip, and kmod MAY support gzip and xz.
>>
>> @@ -2273,7 +2273,7 @@ choice
>> This determines which sort of compression will be used during
>> 'make modules_install'.
>>
>> - GZIP (default) and XZ are supported.
>> + GZIP (default), XZ, and ZSTD are supported.
>>
>> config MODULE_COMPRESS_GZIP
>> bool "GZIP"
>> @@ -2281,6 +2281,9 @@ config MODULE_COMPRESS_GZIP
>> config MODULE_COMPRESS_XZ
>> bool "XZ"
>>
>> +config MODULE_COMPRESS_ZSTD
>> +bool "ZSTD"
>> +
>> endchoice
>>
>> config MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS
>> --
>> 2.31.0.97.g1424303384
>>
>
> Great!
>
> Reviewed-by: Oleksandr Natalenko
>
> This works perfectly fine in Arch Linux if accompanied by the
> following mkinitcpio amendment: [1].
>
> I'm also Cc'ing other people from get_maintainers output just
> to make this submission more visible.
>
> Thanks.
>
> [1] https://github.com/archlinux/mkinitcpio/pull/43
>
> --
> Oleksandr Natalenko (post-factum)
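
The window-size rule in Nick's reply above can be made concrete with a little arithmetic. What follows is a minimal Python sketch, not zstd's actual code: the function names are hypothetical, and it models only the two rules stated in the mail (a `wlog` of N caps the window at 2^N bytes, and the window shrinks to the smallest power of two at least as large as the file). Using `wlog=23` to stand for the "up to 8MB" cap at level 19 is an assumption derived from 8 MiB = 2^23; any lower bound zstd applies to very small inputs is ignored.

```python
MiB = 1 << 20


def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (n must be positive)."""
    if n < 1:
        raise ValueError("n must be positive")
    return 1 << (n - 1).bit_length()


def effective_window_size(file_size: int, wlog: int) -> int:
    """Hypothetical model of the rule in the mail: the window is capped at
    2**wlog bytes and shrinks to fit files smaller than that cap."""
    return min(1 << wlog, next_power_of_two(file_size))


if __name__ == "__main__":
    # A large input is limited by the cap: 8 MiB for wlog=23, 2 MiB for wlog=21.
    print(effective_window_size(100 * MiB, 23) // MiB)   # 8
    print(effective_window_size(100 * MiB, 21) // MiB)   # 2
    # A 300 KiB module only needs a 512 KiB window, whatever the cap is.
    print(effective_window_size(300 * 1024, 21) // 1024)  # 512
```

Under this model, most kernel modules would be well below either cap, so the `wlog` setting would mainly matter for the few largest modules.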
[PATCH v9 2/3] lib: zstd: Add decompress_sources.h for decompress_unzstd
From: Nick Terrell

Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed. This allows us to upgrade the
zstd library version without modifying any callers. Instead we just need
to update decompress_sources.h.

Signed-off-by: Nick Terrell
---
 lib/decompress_unzstd.c       |  6 +-
 lib/zstd/decompress_sources.h | 23 +++
 2 files changed, 24 insertions(+), 5 deletions(-)
 create mode 100644 lib/zstd/decompress_sources.h

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index c88aad49e996..6e5ecfba0a8d 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
 #ifdef STATIC
 # define UNZSTD_PREBOOT
 # include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
 #endif
 
 #include
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index ..d82cea4316f5
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (c) Facebook, Inc.
+ * All rights reserved.
+ *
+ * This source code is licensed under both the BSD-style license (found in the
+ * LICENSE file in the root directory of this source tree) and the GPLv2 (found
+ * in the COPYING file in the root directory of this source tree).
+ * You may select, at your option, one of the above-listed licenses.
+ */
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
-- 
2.31.0
[PATCH v9 1/3] lib: zstd: Add kernel-specific API
From: Nick Terrell

This patch:
- Moves `include/linux/zstd.h` -> `include/linux/zstd_lib.h`
- Updates modified zstd headers to yearless copyright
- Adds a new API in `include/linux/zstd.h` that is functionally
  equivalent to the in-use subset of the current API. Functions are
  renamed to avoid symbol collisions with zstd, to make it clear it is
  not the upstream zstd API, and to follow the kernel style guide.
- Updates all callers to use the new API.

There are no functional changes in this patch. Since there are no
functional changes, I felt it was okay to update all the callers in a
single patch. Once the API is approved, the callers are mechanically
changed.

This patch is preparing for the 3rd patch in this series, which updates
zstd to version 1.4.10. Since the upstream zstd API is no longer exposed
to callers, the update can happen transparently.

Signed-off-by: Nick Terrell
---
 crypto/zstd.c              |   28 +-
 fs/btrfs/zstd.c            |   68 +-
 fs/f2fs/compress.c         |   56 +-
 fs/f2fs/super.c            |    2 +-
 fs/pstore/platform.c       |    2 +-
 fs/squashfs/zstd_wrapper.c |   16 +-
 include/linux/zstd.h       | 1243
 include/linux/zstd_lib.h   | 1157 +
 lib/decompress_unzstd.c    |   42 +-
 lib/zstd/compress.c        |  123 ++--
 lib/zstd/decompress.c      |  112 ++--
 11 files changed, 1691 insertions(+), 1158 deletions(-)
 create mode 100644 include/linux/zstd_lib.h

diff --git a/crypto/zstd.c b/crypto/zstd.c
index 1a3309f066f7..154a969c83a8 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -18,22 +18,22 @@
 #define ZSTD_DEF_LEVEL	3
 
 struct zstd_ctx {
-	ZSTD_CCtx *cctx;
-	ZSTD_DCtx *dctx;
+	zstd_cctx *cctx;
+	zstd_dctx *dctx;
 	void *cwksp;
 	void *dwksp;
 };
 
-static ZSTD_parameters zstd_params(void)
+static zstd_parameters zstd_params(void)
 {
-	return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
+	return zstd_get_params(ZSTD_DEF_LEVEL, 0);
 }
 
 static int zstd_comp_init(struct zstd_ctx *ctx)
 {
 	int ret = 0;
-	const ZSTD_parameters params = zstd_params();
-	const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+	const zstd_parameters params = zstd_params();
+	const size_t wksp_size = zstd_cctx_workspace_bound(&params.cParams);
 
 	ctx->cwksp = vzalloc(wksp_size);
 	if (!ctx->cwksp) {
@@ -41,7 +41,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 		goto out;
 	}
 
-	ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+	ctx->cctx = zstd_init_cctx(ctx->cwksp, wksp_size);
 	if (!ctx->cctx) {
 		ret = -EINVAL;
 		goto out_free;
@@ -56,7 +56,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 static int zstd_decomp_init(struct zstd_ctx *ctx)
 {
 	int ret = 0;
-	const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+	const size_t wksp_size = zstd_dctx_workspace_bound();
 
 	ctx->dwksp = vzalloc(wksp_size);
 	if (!ctx->dwksp) {
@@ -64,7 +64,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx)
 		goto out;
 	}
 
-	ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+	ctx->dctx = zstd_init_dctx(ctx->dwksp, wksp_size);
 	if (!ctx->dctx) {
 		ret = -EINVAL;
 		goto out_free;
@@ -152,10 +152,10 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
 {
 	size_t out_len;
 	struct zstd_ctx *zctx = ctx;
-	const ZSTD_parameters params = zstd_params();
+	const zstd_parameters params = zstd_params();
 
-	out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
-	if (ZSTD_isError(out_len))
+	out_len = zstd_compress_cctx(zctx->cctx, dst, *dlen, src, slen, &params);
+	if (zstd_is_error(out_len))
 		return -EINVAL;
 	*dlen = out_len;
 	return 0;
@@ -182,8 +182,8 @@ static int __zstd_decompress(const u8 *src, unsigned int slen,
 	size_t out_len;
 	struct zstd_ctx *zctx = ctx;
 
-	out_len = ZSTD_decompressDCtx(zctx->dctx, dst, *dlen, src, slen);
-	if (ZSTD_isError(out_len))
+	out_len = zstd_decompress_dctx(zctx->dctx, dst, *dlen, src, slen);
+	if (zstd_is_error(out_len))
 		return -EINVAL;
 	*dlen = out_len;
 	return 0;
diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index 8e9626d63976..14418b02c189 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -28,10 +28,10 @@
 /* 307s to avoid pathologically clashing with transaction commit */
 #define ZSTD_BTRFS_RECLAIM_JIFFIES (307 * HZ)
 
-static ZSTD_parameters zstd_get_btrfs_parameters(unsigned int level,
+static zstd_parameters zstd_get_btrfs_parameters(unsigned int level,
 						 size_t src_len)
 {
-	ZSTD_parameters params = ZSTD_getParams(level, src_len, 0);
+
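
The renames visible in the crypto/zstd.c hunks above follow a regular pattern: the upstream `ZSTD_camelCase` identifier becomes a lower-case, underscore-separated kernel-style name. The Python sketch below only illustrates that naming pattern for identifiers that appear in this diff. The helper `kernel_style_name` is hypothetical and is not part of the patch or of any tooling mentioned in the series, and it says nothing about signature changes (for example, `zstd_get_params()` takes one argument fewer than `ZSTD_getParams()`).

```python
import re


def kernel_style_name(upstream_name: str) -> str:
    """Map an upstream-style identifier such as "ZSTD_initCCtx" to the
    kernel-style spelling used in the diff ("zstd_init_cctx")."""
    # Insert "_" at each lower-case-to-upper-case boundary,
    # e.g. "ZSTD_initCCtx" -> "ZSTD_init_CCtx".
    with_breaks = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", upstream_name)
    return with_breaks.lower()


if __name__ == "__main__":
    # Every identifier below is taken from the crypto/zstd.c hunks.
    for name in (
        "ZSTD_CCtx",
        "ZSTD_parameters",
        "ZSTD_getParams",
        "ZSTD_CCtxWorkspaceBound",
        "ZSTD_initCCtx",
        "ZSTD_compressCCtx",
        "ZSTD_decompressDCtx",
        "ZSTD_isError",
    ):
        print(f"{name} -> {kernel_style_name(name)}")
```

Macros such as `ZSTD_DEF_LEVEL` are left unchanged by the patch, so the helper should only be applied to type and function names.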
[GIT PULL][PATCH v9 0/3] Update to zstd-1.4.10
From: Nick Terrell

Please pull from

  g...@github.com:terrelln/linux.git tags/v9-zstd-1.4.10

to get these changes. Alternatively the patchset is included.

This patchset upgrades the zstd library to the latest upstream release. The current zstd version in the kernel is a modified version of upstream zstd-1.3.1. At the time it was integrated, zstd wasn't ready to be used in the kernel as-is. But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.10 upstream. I want the zstd version in the kernel to match up with a known upstream release, so we know exactly what code is running. Whenever this patchset is ready for merge, I will cut a release at the upstream commit that gets merged. This should not be necessary for future releases.

The kernel zstd library is automatically generated from upstream zstd. A script makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested on every commit by upstream's continuous integration. When we cut a new zstd release, we will submit a patch to the kernel to update the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must build, so that precludes partial updates. Since the commit is 100% generated, I hope the review burden is lightened. I considered replaying upstream commits, but that is not possible because there have been ~3500 upstream commits since the last zstd import, and the commits don't all build individually. The bulk update preserves bisectability because bugs can be bisected to the zstd version update. At that point the update can be reverted, and we can work with upstream to find and fix the bug. After this big switch in how the kernel consumes zstd, future patches will be smaller, because they will only have one upstream release worth of changes each.

This patchset adds a new kernel-style wrapper around zstd. This wrapper API is functionally equivalent to the subset of the current zstd API that is currently used. The wrapper API is renamed to kernel style so that its symbols don't collide with zstd's symbols. The update to zstd-1.4.10 maintains the same API and preserves the semantics, so that none of the callers need to be updated.

This patchset comes in 2 parts:

1. The first 2 patches prepare for the zstd upgrade. The first patch adds the new kernel style API so zstd can be upgraded without modifying any callers. The second patch adds an indirection for how lib/decompress_unzstd.c includes the decompression source files.
2. Import zstd-1.4.10. This patch is completely generated from upstream using automated tooling.

I tested every caller of zstd on x86_64. I tested both after the 1.4.10 upgrade using the compatibility wrapper, and after the final patch in this series.

I tested kernel and initramfs decompression on i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.10. I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k. I found:

* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes. For example, the problem with large kernel decompression (https://lkml.org/lkml/2020/9/29/27) has been fixed upstream for over 2 years.

Please let me know if there is anything that I can do to ease the way for these patches. I think this series is important because it brings large performance improvements, contains bug fixes, and switches to a more maintainable model of consuming upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 meaning unknown. This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.

v2 -> v3:
* (3/9) Silence warnings by Kernel Test Robot: https://github.com/facebook/zstd/pull/2324 Stack size warnings remain, but these aren't new, and the functions it warns on are either unused or not in the maximum stack path. This patchset reduces zstd compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and more stack reduction would require significant changes that have the potential to introduce new bugs. However, I do hope to continue to reduce zstd stack usage in future versions.

v3 -> v4:
* (3/9) Fix errors and warnings reported by Kernel Test Robot
Re: [PATCH v8 1/3] lib: zstd: Add kernel-specific API
On Sat, Mar 27, 2021 at 2:48 PM Oleksandr Natalenko wrote: > > Hello. > > On Sat, Mar 27, 2021 at 05:48:01PM +0800, kernel test robot wrote: > > >> ERROR: modpost: "ZSTD_maxCLevel" [fs/f2fs/f2fs.ko] undefined! > > Since f2fs can be built as a module, the following correction seems to > be needed: Thanks Oleksandr! Looks like f2fs has been updated to use ZSTD_maxCLevel() since the first version of these patches. I'll put up a new version shortly with the fix, and update my test suite to build f2fs and other users as modules, so it can catch this. Best, Nick > ``` > diff --git a/lib/zstd/compress/zstd_compress.c > b/lib/zstd/compress/zstd_compress.c > index 9c998052a0e5..584c92c51169 100644 > --- a/lib/zstd/compress/zstd_compress.c > +++ b/lib/zstd/compress/zstd_compress.c > @@ -4860,6 +4860,7 @@ size_t ZSTD_endStream(ZSTD_CStream* zcs, > ZSTD_outBuffer* output) > > #define ZSTD_MAX_CLEVEL 22 > int ZSTD_maxCLevel(void) { return ZSTD_MAX_CLEVEL; } > +EXPORT_SYMBOL(ZSTD_maxCLevel); > int ZSTD_minCLevel(void) { return (int)-ZSTD_TARGETLENGTH_MAX; } > > static const ZSTD_compressionParameters > ZSTD_defaultCParameters[4][ZSTD_MAX_CLEVEL+1] = { > ``` > > Not sure if the same should be done for `ZSTD_minCLevel()` since I don't > see it being used anywhere else. > > -- > Oleksandr Natalenko (post-factum)
Re: [PATCH v8 3/3] lib: zstd: Upgrade to latest upstream zstd version 1.4.10
On Fri, Mar 26, 2021 at 3:02 PM kernel test robot wrote: > > Hi Nick, > > Thank you for the patch! Perhaps something to improve: > > [auto build test WARNING on cryptodev/master] > [also build test WARNING on kdave/for-next f2fs/dev-test linus/master > v5.12-rc4 next-20210326] > [cannot apply to crypto/master kees/for-next/pstore squashfs/master] > [If your patch is applied to the wrong git tree, kindly drop us a note. > And when submitting patch, we suggest to use '--base' as documented in > https://git-scm.com/docs/git-format-patch] > > url: > https://github.com/0day-ci/linux/commits/Nick-Terrell/Update-to-zstd-1-4-10/20210327-031827 > base: > https://git.kernel.org/pub/scm/linux/kernel/git/herbert/cryptodev-2.6.git > master > config: um-allmodconfig (attached as .config) > compiler: gcc-9 (Debian 9.3.0-22) 9.3.0 > reproduce (this is a W=1 build): > # > https://github.com/0day-ci/linux/commit/ebbff13fa6a537fb8b3dc6b42c3093f9ce4358f8 > git remote add linux-review https://github.com/0day-ci/linux > git fetch --no-tags linux-review > Nick-Terrell/Update-to-zstd-1-4-10/20210327-031827 > git checkout ebbff13fa6a537fb8b3dc6b42c3093f9ce4358f8 > # save the attached .config to linux build tree > make W=1 ARCH=um > > If you fix the issue, kindly add following tag as appropriate > Reported-by: kernel test robot > > All warnings (new ones prefixed by >>): > >lib/zstd/compress/zstd_compress_sequences.c:17: warning: Cannot understand > * -log2(x / 256) lookup table for x in [0, 256). > on line 17 - I thought it was a doc line >lib/zstd/compress/zstd_compress_sequences.c:58: warning: Function > parameter or member 'nbSeq' not described in 'ZSTD_useLowProbCount' > >> lib/zstd/compress/zstd_compress_sequences.c:58: warning: expecting > >> prototype for 1 else we should(). 
Prototype was for ZSTD_useLowProbCount() > >> instead > >> lib/zstd/compress/zstd_compress_sequences.c:67: warning: wrong kernel-doc > >> identifier on line: > * Returns the cost in bytes of encoding the normalized count header. >lib/zstd/compress/zstd_compress_sequences.c:85: warning: Function > parameter or member 'count' not described in 'ZSTD_entropyCost' >lib/zstd/compress/zstd_compress_sequences.c:85: warning: Function > parameter or member 'max' not described in 'ZSTD_entropyCost' >lib/zstd/compress/zstd_compress_sequences.c:85: warning: Function > parameter or member 'total' not described in 'ZSTD_entropyCost' > >> lib/zstd/compress/zstd_compress_sequences.c:85: warning: expecting > >> prototype for Returns the cost in bits of encoding the distribution > >> described by count(). Prototype was for ZSTD_entropyCost() instead >lib/zstd/compress/zstd_compress_sequences.c:99: warning: wrong kernel-doc > identifier on line: > * Returns the cost in bits of encoding the distribution in count using > ctable. >lib/zstd/compress/zstd_compress_sequences.c:139: warning: Function > parameter or member 'norm' not described in 'ZSTD_crossEntropyCost' >lib/zstd/compress/zstd_compress_sequences.c:139: warning: Function > parameter or member 'accuracyLog' not described in 'ZSTD_crossEntropyCost' >lib/zstd/compress/zstd_compress_sequences.c:139: warning: Function > parameter or member 'count' not described in 'ZSTD_crossEntropyCost' >lib/zstd/compress/zstd_compress_sequences.c:139: warning: Function > parameter or member 'max' not described in 'ZSTD_crossEntropyCost' > >> lib/zstd/compress/zstd_compress_sequences.c:139: warning: expecting > >> prototype for Returns the cost in bits of encoding the distribution in > >> count using the(). 
Prototype was for ZSTD_crossEntropyCost() instead > -- >lib/zstd/compress/zstd_ldm.c:584: warning: Function parameter or member > 'rawSeqStore' not described in 'maybeSplitSequence' >lib/zstd/compress/zstd_ldm.c:584: warning: Function parameter or member > 'remaining' not described in 'maybeSplitSequence' >lib/zstd/compress/zstd_ldm.c:584: warning: Function parameter or member > 'minMatch' not described in 'maybeSplitSequence' > >> lib/zstd/compress/zstd_ldm.c:584: warning: expecting prototype for If the > >> sequence length is longer than remaining then the sequence is split(). > >> Prototype was for maybeSplitSequence() instead > -- > >> lib/zstd/decompress/zstd_decompress.c:992: warning: wrong kernel-doc > >> identifier on line: > * Similar to ZSTD_nextSrcSizeToDecompress(), but when when a block input > can be streamed, > -- >lib/zstd/decompress/huf_decompress.c:122: warning: Function parameter or > member 'symb
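For readers unfamiliar with these W=1 warnings: the kernel-doc tool treats any comment that opens with `/**` as a kernel-doc comment and expects a "name() - description" line followed by one `@parameter:` line per argument. The sketch below shows a comment in that shape. The function `clamp_window_log()` is made up for this illustration and is not part of zstd or the kernel; the other common way out is simply to open free-form comments with `/*` instead of `/**`.

```c
/**
 * clamp_window_log() - Limit a window log to a maximum value.
 * @window_log: requested window log
 * @max_log:    largest window log the caller allows
 *
 * Hypothetical helper used only to illustrate the comment layout.
 *
 * Return: @window_log, or @max_log if @window_log exceeds it.
 */
unsigned int clamp_window_log(unsigned int window_log, unsigned int max_log)
{
	return window_log > max_log ? max_log : window_log;
}
```

With this layout every parameter is described and the first line names the function that follows, which are the two things the warnings quoted above complain about.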
[PATCH v8 2/3] lib: zstd: Add decompress_sources.h for decompress_unzstd
From: Nick Terrell Adds decompress_sources.h which includes every .c file necessary for zstd decompression. This is used in decompress_unzstd.c so the internal structure of the library isn't exposed. This allows us to upgrade the zstd library version without modifying any callers. Instead we just need to update decompress_sources.h. Signed-off-by: Nick Terrell --- lib/decompress_unzstd.c | 6 +- lib/zstd/decompress_sources.h | 14 ++ 2 files changed, 15 insertions(+), 5 deletions(-) create mode 100644 lib/zstd/decompress_sources.h diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c index dab2d55cf08d..e6897a5063a7 100644 --- a/lib/decompress_unzstd.c +++ b/lib/decompress_unzstd.c @@ -68,11 +68,7 @@ #ifdef STATIC # define UNZSTD_PREBOOT # include "xxhash.c" -# include "zstd/entropy_common.c" -# include "zstd/fse_decompress.c" -# include "zstd/huf_decompress.c" -# include "zstd/zstd_common.c" -# include "zstd/decompress.c" +# include "zstd/decompress_sources.h" #endif #include diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h new file mode 100644 index ..d2fe10af0043 --- /dev/null +++ b/lib/zstd/decompress_sources.h @@ -0,0 +1,14 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +/* + * This file includes every .c file needed for decompression. + * It is used by lib/decompress_unzstd.c to include the decompression + * source into the translation-unit, so it can be used for kernel + * decompression. + */ + +#include "entropy_common.c" +#include "fse_decompress.c" +#include "huf_decompress.c" +#include "zstd_common.c" +#include "decompress.c" -- 2.31.0
[PATCH v8 0/3] Update to zstd-1.4.10
From: Nick Terrell

Please pull from

  g...@github.com:terrelln/linux.git tags/v8-zstd-1.4.10

to get these changes. Alternatively the patchset is included.

This patchset upgrades the zstd library to the latest upstream release. The current zstd version in the kernel is a modified version of upstream zstd-1.3.1. At the time it was integrated, zstd wasn't ready to be used in the kernel as-is. But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.10 upstream. I want the zstd version in the kernel to match up with a known upstream release, so we know exactly what code is running. Whenever this patchset is ready for merge, I will cut a release at the upstream commit that gets merged. This should not be necessary for future releases.

The kernel zstd library is automatically generated from upstream zstd. A script makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested on every commit by upstream's continuous integration. When we cut a new zstd release, we will submit a patch to the kernel to update the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must build, so that precludes partial updates. Since the commit is 100% generated, I hope the review burden is lightened. I considered replaying upstream commits, but that is not possible because there have been ~3500 upstream commits since the last zstd import, and the commits don't all build individually. The bulk update preserves bisectability because bugs can be bisected to the zstd version update. At that point the update can be reverted, and we can work with upstream to find and fix the bug. After this big switch in how the kernel consumes zstd, future patches will be smaller, because they will only have one upstream release worth of changes each.

This patchset adds a new kernel-style wrapper around zstd. This wrapper API is functionally equivalent to the subset of the current zstd API that is currently used. The wrapper API is renamed to kernel style so that its symbols don't collide with zstd's symbols. The update to zstd-1.4.10 maintains the same API and preserves the semantics, so that none of the callers need to be updated.

This patchset comes in 2 parts:

1. The first 2 patches prepare for the zstd upgrade. The first patch adds the new kernel style API so zstd can be upgraded without modifying any callers. The second patch adds an indirection for how lib/decompress_unzstd.c includes the decompression source files.
2. Import zstd-1.4.10. This patch is completely generated from upstream using automated tooling.

I tested every caller of zstd on x86_64. I tested both after the 1.4.10 upgrade using the compatibility wrapper, and after the final patch in this series.

I tested kernel and initramfs decompression on i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.10. I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k. I found:

* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes. For example, the problem with large kernel decompression (https://lkml.org/lkml/2020/9/29/27) has been fixed upstream for over 2 years.

Please let me know if there is anything that I can do to ease the way for these patches. I think this series is important because it brings large performance improvements, contains bug fixes, and switches to a more maintainable model of consuming upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 meaning unknown. This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.

v2 -> v3:
* (3/9) Silence warnings by Kernel Test Robot: https://github.com/facebook/zstd/pull/2324 Stack size warnings remain, but these aren't new, and the functions it warns on are either unused or not in the maximum stack path. This patchset reduces zstd compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and more stack reduction would require significant changes that have the potential to introduce new bugs. However, I do hope to continue to reduce zstd stack usage in future versions.

v3 -> v4:
* (3/9) Fix errors and warnings reported by Kernel Test Robot
[PATCH v8 1/3] lib: zstd: Add kernel-specific API
From: Nick Terrell This patch: - Moves `include/linux/zstd.h` -> `include/linux/zstd_lib.h` - Adds a new API in `include/linux/zstd.h` that is functionally equivalent to the in-use subset of the current API. Functions are renamed to avoid symbol collisions with zstd, to make it clear it is not the upstream zstd API, and to follow the kernel style guide. - Updates all callers to use the new API. There are no functional changes in this patch. Since there are no functional changes, I felt it was okay to update all the callers in a single patch. Once the API is approved, the callers are mechanically changed. This patch is preparing for the 3rd patch in this series, which updates zstd to version 1.4.10. Since the upstream zstd API is no longer exposed to callers, the update can happen transparently. Signed-off-by: Nick Terrell --- crypto/zstd.c | 28 +- fs/btrfs/zstd.c| 68 +- fs/f2fs/compress.c | 56 +- fs/pstore/platform.c |2 +- fs/squashfs/zstd_wrapper.c | 16 +- include/linux/zstd.h | 1218 include/linux/zstd_lib.h | 1157 ++ lib/decompress_unzstd.c| 42 +- lib/zstd/compress.c| 107 ++-- lib/zstd/decompress.c | 112 ++-- 10 files changed, 1657 insertions(+), 1149 deletions(-) create mode 100644 include/linux/zstd_lib.h diff --git a/crypto/zstd.c b/crypto/zstd.c index 1a3309f066f7..154a969c83a8 100644 --- a/crypto/zstd.c +++ b/crypto/zstd.c @@ -18,22 +18,22 @@ #define ZSTD_DEF_LEVEL 3 struct zstd_ctx { - ZSTD_CCtx *cctx; - ZSTD_DCtx *dctx; + zstd_cctx *cctx; + zstd_dctx *dctx; void *cwksp; void *dwksp; }; -static ZSTD_parameters zstd_params(void) +static zstd_parameters zstd_params(void) { - return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0); + return zstd_get_params(ZSTD_DEF_LEVEL, 0); } static int zstd_comp_init(struct zstd_ctx *ctx) { int ret = 0; - const ZSTD_parameters params = zstd_params(); - const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams); + const zstd_parameters params = zstd_params(); + const size_t wksp_size = zstd_cctx_workspace_bound(); ctx->cwksp = 
vzalloc(wksp_size); if (!ctx->cwksp) { @@ -41,7 +41,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx) goto out; } - ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size); + ctx->cctx = zstd_init_cctx(ctx->cwksp, wksp_size); if (!ctx->cctx) { ret = -EINVAL; goto out_free; @@ -56,7 +56,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx) static int zstd_decomp_init(struct zstd_ctx *ctx) { int ret = 0; - const size_t wksp_size = ZSTD_DCtxWorkspaceBound(); + const size_t wksp_size = zstd_dctx_workspace_bound(); ctx->dwksp = vzalloc(wksp_size); if (!ctx->dwksp) { @@ -64,7 +64,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx) goto out; } - ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size); + ctx->dctx = zstd_init_dctx(ctx->dwksp, wksp_size); if (!ctx->dctx) { ret = -EINVAL; goto out_free; @@ -152,10 +152,10 @@ static int __zstd_compress(const u8 *src, unsigned int slen, { size_t out_len; struct zstd_ctx *zctx = ctx; - const ZSTD_parameters params = zstd_params(); + const zstd_parameters params = zstd_params(); - out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params); - if (ZSTD_isError(out_len)) + out_len = zstd_compress_cctx(zctx->cctx, dst, *dlen, src, slen, ); + if (zstd_is_error(out_len)) return -EINVAL; *dlen = out_len; return 0; @@ -182,8 +182,8 @@ static int __zstd_decompress(const u8 *src, unsigned int slen, size_t out_len; struct zstd_ctx *zctx = ctx; - out_len = ZSTD_decompressDCtx(zctx->dctx, dst, *dlen, src, slen); - if (ZSTD_isError(out_len)) + out_len = zstd_decompress_dctx(zctx->dctx, dst, *dlen, src, slen); + if (zstd_is_error(out_len)) return -EINVAL; *dlen = out_len; return 0; diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c index 8e9626d63976..14418b02c189 100644 --- a/fs/btrfs/zstd.c +++ b/fs/btrfs/zstd.c @@ -28,10 +28,10 @@ /* 307s to avoid pathologically clashing with transaction commit */ #define ZSTD_BTRFS_RECLAIM_JIFFIES (307 * HZ) -static ZSTD_parameters zstd_get_btrfs_parameters(unsigned int level, +static zstd_parameters 
zstd_get_btrfs_parameters(unsigned int level, size_t src_len) { - ZSTD_parameters params = ZSTD_getParams(level, src_len, 0); + zstd_parameters params = zstd_get_params(level, src_len); if (params.cParams.windowLog > ZSTD_B
Re: [f2fs-dev] [PATCH v7 0/3] Update to zstd-1.4.6
> On Dec 16, 2020, at 5:23 PM, Michał Mirosław wrote: > > On Wed, Dec 16, 2020 at 10:07:38PM +0000, Nick Terrell wrote: > [...] >> It is very large. If it helps, in the commit message I’ve provided this link >> [0], >> which provides the diff between upstream zstd as-is and the imported zstd, >> which has been modified by the automated tooling to work in the kernel. >> [0] >> https://github.com/terrelln/linux/commit/ac2ee65dcb7318afe426ad08f6a844faf3aebb41 > > I looks like you could remove a bit more dead code by noting __GNUC__ >= 4 > (gcc-4.9 is currently the oldest supported [1]). Yeah, that would certainly be possible. My goal was to remove the most egregiously irrelevant code from the kernel, in addition to unused functions which would generate -Wframe-larger-than compiler warnings. My tooling doesn’t have the logic to reason about >= relationships yet. If it isn’t too hard to add, I may go ahead and do that, otherwise I will leave it for future work. I view that as a “nice to have” instead of a hard requirement, though let me know if you disagree. Best, Nick > [1] Documentation/process/changes.rst > > Best Regards > Michał Mirosław
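To make the suggestion above concrete, here is the general shape of a compiler-version guard whose fallback branch can never be taken when the oldest supported compiler is gcc-4.9. The macro name `EXAMPLE_LIKELY` is invented for this sketch and is not taken from zstd; it only illustrates why an import script would have to evaluate a `>=` comparison before it could drop the `#else` branch.

```c
/*
 * Hypothetical portability guard (not from zstd).
 * With gcc >= 4.9 as the minimum compiler, the condition below is
 * always true in a kernel build, so the #else branch is dead code.
 */
#if defined(__GNUC__) && (__GNUC__ >= 4)
#  define EXAMPLE_LIKELY(x) __builtin_expect(!!(x), 1)
#else
#  define EXAMPLE_LIKELY(x) (!!(x))
#endif

/* Small user of the macro so the guard has an observable effect. */
int example_is_nonzero(int value)
{
	if (EXAMPLE_LIKELY(value != 0))
		return 1;
	return 0;
}
```

Removing a guard like `#if defined(_MSC_VER)` only needs a defined/undefined decision, while this one needs the tool to know that `__GNUC__ >= 4` holds for every supported compiler.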
Re: [f2fs-dev] [PATCH v7 0/3] Update to zstd-1.4.6
> On Dec 16, 2020, at 10:50 AM, David Sterba wrote: > > On Wed, Dec 16, 2020 at 11:58:07AM +1100, Herbert Xu wrote: >> On Wed, Dec 16, 2020 at 12:48:51AM +0000, Nick Terrell wrote: >>> >>> Thanks for the advice! The first zstd patches went through Herbert’s tree, >>> which is >>> why I’ve sent them this way. >> >> Sorry, but I'm not touch these patches as Christoph's objections >> don't seem to have been addressed. > > I have objections to the current patchset as well, the build bot has > found that some of the function frames are overly large (up to 3800 > bytes) [1], Sorry I missed your reply, David; it didn’t make it to my inbox. Compiled for x86-64, arm, and aarch64, that function does not trigger any -Wframe-larger-than= warnings during the kernel build. It seems like the compiler backend for the parisc architecture (the architecture that the build bot used) is doing a particularly bad job at optimizing this function, because there is nothing in there that should be using that much stack space. I have a test in upstream zstd that measures the stack high water mark for all usage of zstd compression currently in use in the kernel. It says that zstd uses 2KB of stack space in total on x86-64. I used this test to remove 1KB of stack usage from upstream zstd. But, this is still 400 bytes more than the current version of zstd in the kernel. I will look into squeezing out those last 400 bytes of stack usage. > besides the original complaint that the patch 3/3 is 1.5MiB. > > [1] https://lore.kernel.org/lkml/20201204140314.gs6...@twin.jikos.cz/ It is very large. If it helps, in the commit message I’ve provided this link [0], which provides the diff between upstream zstd as-is and the imported zstd, which has been modified by the automated tooling to work in the kernel. [0] https://github.com/terrelln/linux/commit/ac2ee65dcb7318afe426ad08f6a844faf3aebb41 Best, Nick
Re: [f2fs-dev] [PATCH v7 0/3] Update to zstd-1.4.6
> On Dec 15, 2020, at 4:58 PM, Herbert Xu wrote: > > On Wed, Dec 16, 2020 at 12:48:51AM +0000, Nick Terrell wrote: >> >> Thanks for the advice! The first zstd patches went through Herbert’s tree, >> which is >> why I’ve sent them this way. > > Sorry, but I'm not touch these patches as Christoph's objections > don't seem to have been addressed. I believe I’ve addressed Christoph's objections. He suggested creating a wrapper API to avoid changing callers upon the zstd update. I’ve done that. The only difference between the current API and the changes I’ve proposed in patch 1 is that I’ve changed the prefix from ZSTD_ to zstd_ to avoid conflicts and confusion with the upstream zstd API. Christoph, if you get a chance to take a look at these patches, please let me know what you think about the current iteration, and whether I’ve addressed all of your concerns. Best, Nick
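As a rough illustration of the wrapper approach described in this reply, the sketch below forwards a kernel-style `zstd_is_error()` to an upstream-style function. `upstream_is_error()` and the `UPSTREAM_ERROR_MAX` threshold are stand-ins invented so the example compiles on its own; they are assumptions for illustration and not the real zstd implementation or the code in the patch.

```c
#include <stddef.h>

/*
 * Assumption for this sketch only: error results are reported as
 * very large size_t values (small negative numbers cast to size_t).
 */
#define UPSTREAM_ERROR_MAX 120

/* Stand-in for the upstream library entry point (hypothetical). */
static unsigned int upstream_is_error(size_t code)
{
	return code > (size_t)-UPSTREAM_ERROR_MAX;
}

/*
 * Kernel-style wrapper: same behaviour under a zstd_ prefix, so callers
 * never reference the upstream name and the symbols cannot collide.
 */
unsigned int zstd_is_error(size_t code)
{
	return upstream_is_error(code);
}
```

A caller would then write `if (zstd_is_error(ret)) return -EINVAL;` and stay unchanged when the library behind the wrapper is replaced.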
Re: [f2fs-dev] [PATCH v7 0/3] Update to zstd-1.4.6
> On Dec 15, 2020, at 4:00 PM, Eric Biggers wrote: > > On Tue, Dec 15, 2020 at 08:58:52PM +0000, Nick Terrell via Linux-f2fs-devel > wrote: >> >> >>> On Dec 3, 2020, at 12:51 PM, Nick Terrell wrote: >>> >>> From: Nick Terrell >>> >>> Please pull from >>> >>> g...@github.com:terrelln/linux.git tags/v7-zstd-1.4.6 >>> >>> to get these changes. Alternatively the patchset is included. >> >> Is it possible to get this patchset merged in the 5.11 merge window? It >> applies >> cleanly to the latest master. Please let me know if there is anything that I >> can do >> to drive this patchset towards merge. >> >> Thanks, >> Nick > > Well, it's too late for 5.11 for patches that weren't already in linux-next, > so > you'll have to aim for 5.12. > > It looks like you're asking Herbert to pull this into the crypto tree? If > he's > interested in doing that, that could work. However lib/zstd/ isn't that > strongly "crypto-related", and it doesn't actually have a maintainer listed in > MAINTAINERS. Perhaps another path forwards is for you to volunteer to > maintain > lib/zstd/ and send pull requests for it directly to Linus? Thanks for the advice! The first zstd patches went through Herbert’s tree, which is why I’ve sent them this way. I’d be happy to maintain zstd (and lz4, and xxhash), though I don’t know exactly what that entails. Best, Nick > - Eric
Re: [PATCH v7 0/3] Update to zstd-1.4.6
> On Dec 3, 2020, at 12:51 PM, Nick Terrell wrote: > > From: Nick Terrell > > Please pull from > > g...@github.com:terrelln/linux.git tags/v7-zstd-1.4.6 > > to get these changes. Alternatively the patchset is included. Is it possible to get this patchset merged in the 5.11 merge window? It applies cleanly to the latest master. Please let me know if there is anything that I can do to drive this patchset towards merge. Thanks, Nick > This patchset upgrades the zstd library to the latest upstream release. The > current zstd version in the kernel is a modified version of upstream > zstd-1.3.1. > At the time it was integrated, zstd wasn't ready to be used in the kernel > as-is. > But, it is now possible to use upstream zstd directly in the kernel. > > I have not yet released zstd-1.4.6 upstream. I want the zstd version in the > kernel to match up with a known upstream release, so we know exactly what code > is running. Whenever this patchset is ready for merge, I will cut a release at > the upstream commit that gets merged. This should not be necessary for future > releases. > > The kernel zstd library is automatically generated from upstream zstd. A > script > makes the necessary changes and imports it into the kernel. The changes are: > > 1. Replace all libc dependencies with kernel replacements and rewrite > includes. > 2. Remove unncessary portability macros like: #if defined(_MSC_VER). > 3. Use the kernel xxhash instead of bundling it. > > This automation gets tested every commit by upstream's continuous integration. > When we cut a new zstd release, we will submit a patch to the kernel to update > the zstd version in the kernel. > > I've updated zstd to upstream with one big patch because every commit must > build, > so that precludes partial updates. Since the commit is 100% generated, I hope > the > review burden is lightened. 
I considered replaying upstream commits, but that > is > not possible because there have been ~3500 upstream commits since the last > zstd > import, and the commits don't all build individually. The bulk update > preserves > bisectablity because bugs can be bisected to the zstd version update. At that > point the update can be reverted, and we can work with upstream to find and > fix > the bug. After this big switch in how the kernel consumes zstd, future patches > will be smaller, because they will only have one upstream release worth of > changes each. > > This patchset adds a new kernel-style wrapper around zstd. This wrapper API is > functionally equivalent to the subset of the current zstd API that is > currently > used. The wrapper API changes to be kernel style so that the symbols don't > collide with zstd's symbols. The update to zstd-1.4.6 maintains the same API > and preserves the semantics, so that none of the callers need to be updated. > > This patchset comes in 2 parts: > 1. The first 2 patches prepare for the zstd upgrade. The first patch adds the > new kernel style API so zstd can be upgraded without modifying any callers. > The second patch adds an indirection for the lib/decompress_unzstd.c > including of all decompression source files. > 2. Import zstd-1.4.6. This patch is completely generated from upstream using > automated tooling. > > I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade > using the compatibility wrapper, and after the final patch in this series. > > I tested kernel and initramfs decompression in i386 and arm. > > I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6. > I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k. 
> I found: > * BtrFS zstd compression at levels 1 and 3 is 5% faster > * BtrFS zstd decompression+read is 15% faster > * SquashFS zstd decompression+read is 15% faster > * F2FS zstd compression+write at level 3 is 8% faster > * F2FS zstd decompression+read is 20% faster > * ZRAM decompression+read is 30% faster > * Kernel zstd decompression is 35% faster > * Initramfs zstd decompression+build is 5% faster > > The latest zstd also offers bug fixes and a 1 KB reduction in stack uage > during > compression. For example the recent problem with large kernel decompression > has > been fixed upstream for over 2 years https://lkml.org/lkml/2020/9/29/27. > > Please let me know if there is anything that I can do to ease the way for > these > patches. I think it is important because it gets large performance > improvements, > contains bug fixes, and is switching to a more maintainable model of consuming > upstream zstd directly, making it easy to keep up to date. > > Best, > Nick Terrell > > v1 -> v2: > * S
Re: [PATCH v6 1/3] lib: zstd: Add kernel-specific API
> On Dec 2, 2020, at 9:03 PM, Michał Mirosław wrote: > > On Thu, Dec 03, 2020 at 03:59:21AM +0000, Nick Terrell wrote: >> On Dec 2, 2020, at 7:14 PM, Michał Mirosław wrote: >>> On Thu, Dec 03, 2020 at 01:42:03AM +, Nick Terrell wrote: >>>> On Dec 2, 2020, at 5:16 PM, Michał Mirosław >>>> wrote: >>>>> On Wed, Dec 02, 2020 at 12:32:40PM -0800, Nick Terrell wrote: >>>>>> From: Nick Terrell >>>>>> >>>>>> This patch: >>>>>> - Moves `include/linux/zstd.h` -> `lib/zstd/zstd.h` >>>>>> - Adds a new API in `include/linux/zstd.h` that is functionally >>>>>> equivalent to the in-use subset of the current API. Functions are >>>>>> renamed to avoid symbol collisions with zstd, to make it clear it is >>>>>> not the upstream zstd API, and to follow the kernel style guide. >>>>>> - Updates all callers to use the new API. >>>>>> >>>>>> There are no functional changes in this patch. Since there are no >>>>>> functional change, I felt it was okay to update all the callers in a >>>>>> single patch, since once the API is approved, the callers are >>>>>> mechanically changed. >>>>> [...] >>>>>> --- a/lib/decompress_unzstd.c >>>>>> +++ b/lib/decompress_unzstd.c >>>>> [...] 
>>>>>> static int INIT handle_zstd_error(size_t ret, void (*error)(char *x)) >>>>>> { >>>>>> -const int err = ZSTD_getErrorCode(ret); >>>>>> - >>>>>> -if (!ZSTD_isError(ret)) >>>>>> +if (!zstd_is_error(ret)) >>>>>> return 0; >>>>>> >>>>>> -switch (err) { >>>>>> -case ZSTD_error_memory_allocation: >>>>>> -error("ZSTD decompressor ran out of memory"); >>>>>> -break; >>>>>> -case ZSTD_error_prefix_unknown: >>>>>> -error("Input is not in the ZSTD format (wrong magic >>>>>> bytes)"); >>>>>> -break; >>>>>> -case ZSTD_error_dstSize_tooSmall: >>>>>> -case ZSTD_error_corruption_detected: >>>>>> -case ZSTD_error_checksum_wrong: >>>>>> -error("ZSTD-compressed data is corrupt"); >>>>>> -break; >>>>>> -default: >>>>>> -error("ZSTD-compressed data is probably corrupt"); >>>>>> -break; >>>>>> -} >>>>>> +error("ZSTD decompression failed"); >>>>>> return -1; >>>>>> } >>>>> >>>>> This looses diagnostics specificity - is this intended? At least the >>>>> out-of-memory condition seems useful to distinguish. >>>> >>>> Good point. The zstd API no longer exposes the error code enum, >>>> but it does expose zstd_get_error_name() which can be used here. >>>> I was thinking that the string needed to be static for some reason, but >>>> that is not the case. I will make that change. >>>> >>>>>> +size_t zstd_compress_stream(zstd_cstream *cstream, >>>>>> +struct zstd_out_buffer *output, struct zstd_in_buffer *input) >>>>>> +{ >>>>>> +ZSTD_outBuffer o; >>>>>> +ZSTD_inBuffer i; >>>>>> +size_t ret; >>>>>> + >>>>>> +memcpy(&o, output, sizeof(o)); >>>>>> +memcpy(&i, input, sizeof(i)); >>>>>> +ret = ZSTD_compressStream(cstream, &o, &i); >>>>>> +memcpy(output, &o, sizeof(o)); >>>>>> +memcpy(input, &i, sizeof(i)); >>>>>> +return ret; >>>>>> +} >>>>> >>>>> Is all this copying necessary? How is it different from type-punning by >>>>> direct pointer cast? >>>> >>>> If breaking strict aliasing and type-punning by pointer casing is okay, >>>> then >>>> we can do that here. 
These memcpys will be negligible for performance, but >>>> type-punning would be more succinct if allowed. >>> >>> Ah, this might break LTO builds due to strict aliasing violation. >>> So I would suggest to just #define the ZSTD names to kernel ones >>> for the library code. Unless there is a cleaner solution... >> >> I don’t want to do that because I want in the 3rd series of the patchset I >> update >> to zstd-1.4.6. And I’m using zstd-1.4.6 as-is in upstream. This allows us to >> keep >> the kernel version up to date, since the patch to update to a new version >> can be >> generated automatically (and manually tested), so it doesn’t fall years >> behind >> upstream again. >> >> The alternative would be to make upstream zstd’s header public and >> #define zstd_in_buffer ZSTD_inBuffer. But that would make zstd’s header >> public, which would somewhat defeat the purpose of having a kernel wrapper. > > I thought the problem was API style spill-over from the library to other parts > of the kernel. A header-only wrapper can stop this. I'm not sure symbol > visibility (namespace pollution) was a concern. Thanks for the review Michał! I have just submitted a new version of the patches with the suggested changes! Best, Nick Terrell > Best Regards > Michał Mirosław
[PATCH v7 2/3] lib: zstd: Add decompress_sources.h for decompress_unzstd
From: Nick Terrell

Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed. This allows us to upgrade the
zstd library version without modifying any callers. Instead we just
need to update decompress_sources.h.

Signed-off-by: Nick Terrell
---
 lib/decompress_unzstd.c       |  6 +-
 lib/zstd/decompress_sources.h | 14 ++
 2 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 lib/zstd/decompress_sources.h

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index dab2d55cf08d..e6897a5063a7 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
 #ifdef STATIC
 # define UNZSTD_PREBOOT
 # include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
 #endif
 
 #include
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index ..d2fe10af0043
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
--
2.29.2
[PATCH v7 1/3] lib: zstd: Add kernel-specific API
From: Nick Terrell

This patch:
- Moves `include/linux/zstd.h` -> `include/linux/zstd_lib.h`
- Adds a new API in `include/linux/zstd.h` that is functionally
  equivalent to the in-use subset of the current API. Functions are
  renamed to avoid symbol collisions with zstd, to make it clear it is
  not the upstream zstd API, and to follow the kernel style guide.
- Updates all callers to use the new API.

There are no functional changes in this patch. Since there are no
functional changes, I felt it was okay to update all the callers in a
single patch, since once the API is approved, the callers are
mechanically changed.

This patch is preparing the next patch in the series, which updates
zstd to version 1.4.6. Since the upstream zstd API is no longer exposed
to callers, the update can happen transparently.

Signed-off-by: Nick Terrell
---
 crypto/zstd.c              |   28 +-
 fs/btrfs/zstd.c            |   68 +-
 fs/f2fs/compress.c         |   56 +-
 fs/pstore/platform.c       |    2 +-
 fs/squashfs/zstd_wrapper.c |   16 +-
 include/linux/zstd.h       | 1218
 include/linux/zstd_lib.h   | 1157 ++
 lib/decompress_unzstd.c    |   42 +-
 lib/zstd/compress.c        |  107 ++--
 lib/zstd/decompress.c      |  112 ++--
 10 files changed, 1657 insertions(+), 1149 deletions(-)
 create mode 100644 include/linux/zstd_lib.h

diff --git a/crypto/zstd.c b/crypto/zstd.c
index 1a3309f066f7..154a969c83a8 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -18,22 +18,22 @@
 #define ZSTD_DEF_LEVEL 3
 
 struct zstd_ctx {
-	ZSTD_CCtx *cctx;
-	ZSTD_DCtx *dctx;
+	zstd_cctx *cctx;
+	zstd_dctx *dctx;
 	void *cwksp;
 	void *dwksp;
 };
 
-static ZSTD_parameters zstd_params(void)
+static zstd_parameters zstd_params(void)
 {
-	return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
+	return zstd_get_params(ZSTD_DEF_LEVEL, 0);
 }
 
 static int zstd_comp_init(struct zstd_ctx *ctx)
 {
 	int ret = 0;
-	const ZSTD_parameters params = zstd_params();
-	const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+	const zstd_parameters params = zstd_params();
+	const size_t wksp_size = zstd_cctx_workspace_bound(&params.cParams);
 
 	ctx->cwksp = vzalloc(wksp_size);
 	if (!ctx->cwksp) {
@@ -41,7 +41,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 		goto out;
 	}
 
-	ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+	ctx->cctx = zstd_init_cctx(ctx->cwksp, wksp_size);
 	if (!ctx->cctx) {
 		ret = -EINVAL;
 		goto out_free;
@@ -56,7 +56,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 static int zstd_decomp_init(struct zstd_ctx *ctx)
 {
 	int ret = 0;
-	const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+	const size_t wksp_size = zstd_dctx_workspace_bound();
 
 	ctx->dwksp = vzalloc(wksp_size);
 	if (!ctx->dwksp) {
@@ -64,7 +64,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx)
 		goto out;
 	}
 
-	ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+	ctx->dctx = zstd_init_dctx(ctx->dwksp, wksp_size);
 	if (!ctx->dctx) {
 		ret = -EINVAL;
 		goto out_free;
@@ -152,10 +152,10 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
 {
 	size_t out_len;
 	struct zstd_ctx *zctx = ctx;
-	const ZSTD_parameters params = zstd_params();
+	const zstd_parameters params = zstd_params();
 
-	out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
-	if (ZSTD_isError(out_len))
+	out_len = zstd_compress_cctx(zctx->cctx, dst, *dlen, src, slen, &params);
+	if (zstd_is_error(out_len))
 		return -EINVAL;
 	*dlen = out_len;
 	return 0;
@@ -182,8 +182,8 @@ static int __zstd_decompress(const u8 *src, unsigned int slen,
 	size_t out_len;
 	struct zstd_ctx *zctx = ctx;
 
-	out_len = ZSTD_decompressDCtx(zctx->dctx, dst, *dlen, src, slen);
-	if (ZSTD_isError(out_len))
+	out_len = zstd_decompress_dctx(zctx->dctx, dst, *dlen, src, slen);
+	if (zstd_is_error(out_len))
 		return -EINVAL;
 	*dlen = out_len;
 	return 0;
diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index 9a4871636c6c..c8cf690013f3 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -28,10 +28,10 @@
 /* 307s to avoid pathologically clashing with transaction commit */
 #define ZSTD_BTRFS_RECLAIM_JIFFIES (307 * HZ)
 
-static ZSTD_parameters zstd_get_btrfs_parameters(unsigned int level,
+static zstd_parameters zstd_get_btrfs_parameters(unsigned int level,
 						 size_t src_len)
 {
-	ZSTD_parameters params = ZSTD_getParams(level, src_len, 0);
+	zstd_parameters params = zstd_get_params(level, src_len);
 
 	if (params.cParams.windowLog > ZSTD_B
[PATCH v7 0/3] Update to zstd-1.4.6
From: Nick Terrell

Please pull from g...@github.com:terrelln/linux.git tags/v7-zstd-1.4.6 to get these changes. Alternatively the patchset is included.

This patchset upgrades the zstd library to the latest upstream release. The current zstd version in the kernel is a modified version of upstream zstd-1.3.1. At the time it was integrated, zstd wasn't ready to be used in the kernel as-is. But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.6 upstream. I want the zstd version in the kernel to match up with a known upstream release, so we know exactly what code is running. Whenever this patchset is ready for merge, I will cut a release at the upstream commit that gets merged. This should not be necessary for future releases.

The kernel zstd library is automatically generated from upstream zstd. A script makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested every commit by upstream's continuous integration. When we cut a new zstd release, we will submit a patch to the kernel to update the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must build, so that precludes partial updates. Since the commit is 100% generated, I hope the review burden is lightened. I considered replaying upstream commits, but that is not possible because there have been ~3500 upstream commits since the last zstd import, and the commits don't all build individually. The bulk update preserves bisectability because bugs can be bisected to the zstd version update. At that point the update can be reverted, and we can work with upstream to find and fix the bug. After this big switch in how the kernel consumes zstd, future patches will be smaller, because they will only have one upstream release worth of changes each.

This patchset adds a new kernel-style wrapper around zstd. This wrapper API is functionally equivalent to the subset of the current zstd API that is currently used. The wrapper API changes to be kernel style so that the symbols don't collide with zstd's symbols. The update to zstd-1.4.6 maintains the same API and preserves the semantics, so that none of the callers need to be updated.

This patchset comes in 2 parts:

1. The first 2 patches prepare for the zstd upgrade. The first patch adds the new kernel style API so zstd can be upgraded without modifying any callers. The second patch adds an indirection for the lib/decompress_unzstd.c including of all decompression source files.
2. Import zstd-1.4.6. This patch is completely generated from upstream using automated tooling.

I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade using the compatibility wrapper, and after the final patch in this series. I tested kernel and initramfs decompression in i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6. I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k. I found:

* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during compression. For example the recent problem with large kernel decompression has been fixed upstream for over 2 years https://lkml.org/lkml/2020/9/29/27.

Please let me know if there is anything that I can do to ease the way for these patches. I think it is important because it gets large performance improvements, contains bug fixes, and is switching to a more maintainable model of consuming upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 means unknown. This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.

v2 -> v3:
* (3/9) Silence warnings by Kernel Test Robot: https://github.com/facebook/zstd/pull/2324 Stack size warnings remain, but these aren't new, and the functions it warns on are either unused or not in the maximum stack path. This patchset reduces zstd compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and more stack reduction would require significant changes that have the potential to introduce new bugs. However, I do hope to continue to reduce zstd stack usage in future versions.

v3 -> v4:
* (3/9) F
Re: [PATCH v6 1/3] lib: zstd: Add kernel-specific API
> On Dec 2, 2020, at 9:03 PM, Michał Mirosław wrote: > > On Thu, Dec 03, 2020 at 03:59:21AM +0000, Nick Terrell wrote: >> On Dec 2, 2020, at 7:14 PM, Michał Mirosław wrote: >>> On Thu, Dec 03, 2020 at 01:42:03AM +0000, Nick Terrell wrote: >>>> On Dec 2, 2020, at 5:16 PM, Michał Mirosław >>>> wrote: >>>>> On Wed, Dec 02, 2020 at 12:32:40PM -0800, Nick Terrell wrote: >>>>>> From: Nick Terrell >>>>>> >>>>>> This patch: >>>>>> - Moves `include/linux/zstd.h` -> `lib/zstd/zstd.h` >>>>>> - Adds a new API in `include/linux/zstd.h` that is functionally >>>>>> equivalent to the in-use subset of the current API. Functions are >>>>>> renamed to avoid symbol collisions with zstd, to make it clear it is >>>>>> not the upstream zstd API, and to follow the kernel style guide. >>>>>> - Updates all callers to use the new API. >>>>>> >>>>>> There are no functional changes in this patch. Since there are no >>>>>> functional change, I felt it was okay to update all the callers in a >>>>>> single patch, since once the API is approved, the callers are >>>>>> mechanically changed. >>>>> [...] >>>>>> --- a/lib/decompress_unzstd.c >>>>>> +++ b/lib/decompress_unzstd.c >>>>> [...] 
>>>>>> static int INIT handle_zstd_error(size_t ret, void (*error)(char *x)) >>>>>> { >>>>>> -const int err = ZSTD_getErrorCode(ret); >>>>>> - >>>>>> -if (!ZSTD_isError(ret)) >>>>>> +if (!zstd_is_error(ret)) >>>>>> return 0; >>>>>> >>>>>> -switch (err) { >>>>>> -case ZSTD_error_memory_allocation: >>>>>> -error("ZSTD decompressor ran out of memory"); >>>>>> -break; >>>>>> -case ZSTD_error_prefix_unknown: >>>>>> -error("Input is not in the ZSTD format (wrong magic bytes)"); >>>>>> -break; >>>>>> -case ZSTD_error_dstSize_tooSmall: >>>>>> -case ZSTD_error_corruption_detected: >>>>>> -case ZSTD_error_checksum_wrong: >>>>>> -error("ZSTD-compressed data is corrupt"); >>>>>> -break; >>>>>> -default: >>>>>> -error("ZSTD-compressed data is probably corrupt"); >>>>>> -break; >>>>>> -} >>>>>> +error("ZSTD decompression failed"); >>>>>> return -1; >>>>>> } >>>>> >>>>> This looses diagnostics specificity - is this intended? At least the >>>>> out-of-memory condition seems useful to distinguish. >>>> >>>> Good point. The zstd API no longer exposes the error code enum, >>>> but it does expose zstd_get_error_name() which can be used here. >>>> I was thinking that the string needed to be static for some reason, but >>>> that is not the case. I will make that change. >>>> >>>>>> +size_t zstd_compress_stream(zstd_cstream *cstream, >>>>>> +struct zstd_out_buffer *output, struct zstd_in_buffer *input) >>>>>> +{ >>>>>> +ZSTD_outBuffer o; >>>>>> +ZSTD_inBuffer i; >>>>>> +size_t ret; >>>>>> + >>>>>> +memcpy(&o, output, sizeof(o)); >>>>>> +memcpy(&i, input, sizeof(i)); >>>>>> +ret = ZSTD_compressStream(cstream, &o, &i); >>>>>> +memcpy(output, &o, sizeof(o)); >>>>>> +memcpy(input, &i, sizeof(i)); >>>>>> +return ret; >>>>>> +} >>>>> >>>>> Is all this copying necessary? How is it different from type-punning by >>>>> direct pointer cast? >>>> >>>> If breaking strict aliasing and type-punning by pointer casing is okay, >>>> then >>>> we can do that here. 
These memcpys will be negligible for performance, but >>>> type-punning would be more succinct if allowed. >>> >>> Ah, this might break LTO builds due to strict aliasing violation. >>> So I would suggest to just #define the ZSTD names to kernel ones >>> for the library code. Unless there is a cleaner solution... >> >> I don’t want to do that because I want in the 3rd series of the patchset I >> update >> to zstd-1.4.6. And I’m using zstd-1.4.6 as-is in upstream. This allows us to >> keep >> the kernel version up to date, since the patch to update to a new version >> can be >> generated automatically (and manually tested), so it doesn’t fall years >> behind >> upstream again. >> >> The alternative would be to make upstream zstd’s header public and >> #define zstd_in_buffer ZSTD_inBuffer. But that would make zstd’s header >> public, which would somewhat defeat the purpose of having a kernel wrapper. > > I thought the problem was API style spill-over from the library to other parts > of the kernel. A header-only wrapper can stop this. I'm not sure symbol > visibility (namespace pollution) was a concern. That’s true. It seems slightly unclean, but so is duplicating these structs and memcpying them. So I’ll go ahead and expose the upstream zstd’s header (“lib/zstd/zstd.h” here). I’ll just need to pick a name for the upstream “zstd.h” header. > Best Regards > Michał Mirosław
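For readers following the thread, the header-only wrapper agreed on above can be sketched roughly as below. This is only an illustration under stated assumptions: `LIB_outBuffer`, `LIB_flush`, `wrap_out_buffer` and `wrap_flush` are hypothetical stand-ins invented for the example, not names from zstd or the kernel. The point is that aliasing the wrapper type to the library type with a typedef means both names denote the same type, so no struct copy and no pointer cast is needed.

```c
#include <stddef.h>

/* --- Stand-in for the upstream library header (hypothetical mock) --- */
typedef struct {
	void *dst;    /* start of the output buffer */
	size_t size;  /* capacity of the output buffer */
	size_t pos;   /* bytes already written */
} LIB_outBuffer;

/* Mock library call: "writes" up to `pending` bytes, returns bytes left over. */
static size_t LIB_flush(LIB_outBuffer *out, size_t pending)
{
	size_t room = out->size - out->pos;
	size_t n = pending < room ? pending : room;

	out->pos += n;
	return pending - n;
}

/* --- Header-only, kernel-style wrapper (hypothetical names) --- */

/* Same type under a kernel-style name: no duplicate struct definition. */
typedef LIB_outBuffer wrap_out_buffer;

/* Thin inline wrapper: forwards the caller's buffer directly. */
static inline size_t wrap_flush(wrap_out_buffer *out, size_t pending)
{
	return LIB_flush(out, pending);
}
```

A caller only ever spells the kernel-style names, while the library header still has to be visible to the wrapper header, which is the trade-off discussed in the message above.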
Re: [PATCH v6 1/3] lib: zstd: Add kernel-specific API
> On Dec 2, 2020, at 7:14 PM, Michał Mirosław wrote: > > On Thu, Dec 03, 2020 at 01:42:03AM +0000, Nick Terrell wrote: >> >> >>> On Dec 2, 2020, at 5:16 PM, Michał Mirosław wrote: >>> >>> On Wed, Dec 02, 2020 at 12:32:40PM -0800, Nick Terrell wrote: >>>> From: Nick Terrell >>>> >>>> This patch: >>>> - Moves `include/linux/zstd.h` -> `lib/zstd/zstd.h` >>>> - Adds a new API in `include/linux/zstd.h` that is functionally >>>> equivalent to the in-use subset of the current API. Functions are >>>> renamed to avoid symbol collisions with zstd, to make it clear it is >>>> not the upstream zstd API, and to follow the kernel style guide. >>>> - Updates all callers to use the new API. >>>> >>>> There are no functional changes in this patch. Since there are no >>>> functional change, I felt it was okay to update all the callers in a >>>> single patch, since once the API is approved, the callers are >>>> mechanically changed. >>> [...] >>>> --- a/lib/decompress_unzstd.c >>>> +++ b/lib/decompress_unzstd.c >>> [...] >>>> static int INIT handle_zstd_error(size_t ret, void (*error)(char *x)) >>>> { >>>> - const int err = ZSTD_getErrorCode(ret); >>>> - >>>> - if (!ZSTD_isError(ret)) >>>> + if (!zstd_is_error(ret)) >>>>return 0; >>>> >>>> - switch (err) { >>>> - case ZSTD_error_memory_allocation: >>>> - error("ZSTD decompressor ran out of memory"); >>>> - break; >>>> - case ZSTD_error_prefix_unknown: >>>> - error("Input is not in the ZSTD format (wrong magic bytes)"); >>>> - break; >>>> - case ZSTD_error_dstSize_tooSmall: >>>> - case ZSTD_error_corruption_detected: >>>> - case ZSTD_error_checksum_wrong: >>>> - error("ZSTD-compressed data is corrupt"); >>>> - break; >>>> - default: >>>> - error("ZSTD-compressed data is probably corrupt"); >>>> - break; >>>> - } >>>> + error("ZSTD decompression failed"); >>>>return -1; >>>> } >>> >>> This looses diagnostics specificity - is this intended? At least the >>> out-of-memory condition seems useful to distinguish. >> >> Good point. 
The zstd API no longer exposes the error code enum, >> but it does expose zstd_get_error_name() which can be used here. >> I was thinking that the string needed to be static for some reason, but >> that is not the case. I will make that change. >> >>>> +size_t zstd_compress_stream(zstd_cstream *cstream, >>>> + struct zstd_out_buffer *output, struct zstd_in_buffer *input) >>>> +{ >>>> + ZSTD_outBuffer o; >>>> + ZSTD_inBuffer i; >>>> + size_t ret; >>>> + >>>> + memcpy(&o, output, sizeof(o)); >>>> + memcpy(&i, input, sizeof(i)); >>>> + ret = ZSTD_compressStream(cstream, &o, &i); >>>> + memcpy(output, &o, sizeof(o)); >>>> + memcpy(input, &i, sizeof(i)); >>>> + return ret; >>>> +} >>> >>> Is all this copying necessary? How is it different from type-punning by >>> direct pointer cast? >> >> If breaking strict aliasing and type-punning by pointer casing is okay, then >> we can do that here. These memcpys will be negligible for performance, but >> type-punning would be more succinct if allowed. > > Ah, this might break LTO builds due to strict aliasing violation. > So I would suggest to just #define the ZSTD names to kernel ones > for the library code. Unless there is a cleaner solution... I don’t want to do that because in the 3rd patch of the series I update to zstd-1.4.6, and I’m using zstd-1.4.6 as-is from upstream. This allows us to keep the kernel version up to date, since the patch to update to a new version can be generated automatically (and manually tested), so it doesn’t fall years behind upstream again. The alternative would be to make upstream zstd’s header public and #define zstd_in_buffer ZSTD_inBuffer. But that would make zstd’s header public, which would somewhat defeat the purpose of having a kernel wrapper. These memcpys won’t hurt performance, since this function is called at most every 4KB of input data in all the callers, though they are clunky.
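The copy-in, call, copy-out pattern debated above can be illustrated with a small self-contained sketch. Everything here is a hypothetical mock (`lib_in_buffer`, `lib_consume`, `wrap_in_buffer`, `wrap_consume`); it only shows why two separately defined but layout-identical structs are bridged with memcpy() rather than a pointer cast, which would access an object through an incompatible type.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the library's buffer type (hypothetical mock). */
typedef struct {
	const void *src;
	size_t size;
	size_t pos;
} lib_in_buffer;

/* Separately defined wrapper type with the same member layout. */
struct wrap_in_buffer {
	const void *src;
	size_t size;
	size_t pos;
};

/* A byte-wise copy only makes sense if the two types have the same size. */
_Static_assert(sizeof(lib_in_buffer) == sizeof(struct wrap_in_buffer),
	       "wrapper and library buffer layouts differ");

/* Mock library call: consumes up to 4 bytes, returns bytes remaining. */
static size_t lib_consume(lib_in_buffer *in)
{
	size_t left = in->size - in->pos;
	size_t n = left < 4 ? left : 4;

	in->pos += n;
	return in->size - in->pos;
}

/* Wrapper: copy the caller's struct in, call the library, copy it back. */
size_t wrap_consume(struct wrap_in_buffer *input)
{
	lib_in_buffer i;
	size_t ret;

	memcpy(&i, input, sizeof(i));
	ret = lib_consume(&i);
	memcpy(input, &i, sizeof(i));
	return ret;
}
```

The copies are a few machine words per call, which matches the remark above that the cost is negligible when the function runs once per several kilobytes of input.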
Re: [PATCH v6 1/3] lib: zstd: Add kernel-specific API
> On Dec 2, 2020, at 5:16 PM, Michał Mirosław wrote: > > On Wed, Dec 02, 2020 at 12:32:40PM -0800, Nick Terrell wrote: >> From: Nick Terrell >> >> This patch: >> - Moves `include/linux/zstd.h` -> `lib/zstd/zstd.h` >> - Adds a new API in `include/linux/zstd.h` that is functionally >> equivalent to the in-use subset of the current API. Functions are >> renamed to avoid symbol collisions with zstd, to make it clear it is >> not the upstream zstd API, and to follow the kernel style guide. >> - Updates all callers to use the new API. >> >> There are no functional changes in this patch. Since there are no >> functional change, I felt it was okay to update all the callers in a >> single patch, since once the API is approved, the callers are >> mechanically changed. > [...] >> --- a/lib/decompress_unzstd.c >> +++ b/lib/decompress_unzstd.c > [...] >> static int INIT handle_zstd_error(size_t ret, void (*error)(char *x)) >> { >> -const int err = ZSTD_getErrorCode(ret); >> - >> -if (!ZSTD_isError(ret)) >> +if (!zstd_is_error(ret)) >> return 0; >> >> -switch (err) { >> -case ZSTD_error_memory_allocation: >> -error("ZSTD decompressor ran out of memory"); >> -break; >> -case ZSTD_error_prefix_unknown: >> -error("Input is not in the ZSTD format (wrong magic bytes)"); >> -break; >> -case ZSTD_error_dstSize_tooSmall: >> -case ZSTD_error_corruption_detected: >> -case ZSTD_error_checksum_wrong: >> -error("ZSTD-compressed data is corrupt"); >> -break; >> -default: >> -error("ZSTD-compressed data is probably corrupt"); >> -break; >> -} >> +error("ZSTD decompression failed"); >> return -1; >> } > > This looses diagnostics specificity - is this intended? At least the > out-of-memory condition seems useful to distinguish. Good point. The zstd API no longer exposes the error code enum, but it does expose zstd_get_error_name() which can be used here. I was thinking that the string needed to be static for some reason, but that is not the case. I will make that change. 
>> +size_t zstd_compress_stream(zstd_cstream *cstream, >> +struct zstd_out_buffer *output, struct zstd_in_buffer *input) >> +{ >> +ZSTD_outBuffer o; >> +ZSTD_inBuffer i; >> +size_t ret; >> + >> +memcpy(&o, output, sizeof(o)); >> +memcpy(&i, input, sizeof(i)); >> +ret = ZSTD_compressStream(cstream, &o, &i); >> +memcpy(output, &o, sizeof(o)); >> +memcpy(input, &i, sizeof(i)); >> +return ret; >> +} > > Is all this copying necessary? How is it different from type-punning by > direct pointer cast? If breaking strict aliasing and type-punning by pointer casting is okay, then we can do that here. These memcpys will be negligible for performance, but type-punning would be more succinct if allowed. Best, Nick
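The first point in this message, reporting the library's error name instead of one generic string, could look roughly like the sketch below. It is a mock, not the patch: `mock_is_error()` and `mock_get_error_name()` are hypothetical stand-ins for the wrapper functions mentioned above, and the error encoding and message strings are invented for the example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical error encoding: the two largest size_t values are errors. */
#define MOCK_ERR_NOMEM  ((size_t)-1)
#define MOCK_ERR_PREFIX ((size_t)-2)

static int mock_is_error(size_t ret)
{
	return ret >= MOCK_ERR_PREFIX;
}

/* Returns a string with static storage duration describing the error. */
static const char *mock_get_error_name(size_t ret)
{
	if (ret == MOCK_ERR_NOMEM)
		return "out of memory";
	if (ret == MOCK_ERR_PREFIX)
		return "unknown prefix";
	return "no error";
}

/* Test helper: remembers the last message passed to the callback. */
static const char *last_message;

static void record_error(char *msg)
{
	last_message = msg;
}

/* Reports the specific error name through the caller's callback. */
int handle_error(size_t ret, void (*error)(char *x))
{
	if (!mock_is_error(ret))
		return 0;

	/* The callback takes char *, so the const qualifier is cast away. */
	error((char *)mock_get_error_name(ret));
	return -1;
}
```

This keeps the out-of-memory case distinguishable without the caller needing access to the library's error enum.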
[PATCH v6 2/3] lib: zstd: Add decompress_sources.h for decompress_unzstd
From: Nick Terrell

Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed. This allows us to upgrade the
zstd library version without modifying any callers. Instead we just
need to update decompress_sources.h.

Signed-off-by: Nick Terrell
---
 lib/decompress_unzstd.c       |  6 +-
 lib/zstd/decompress_sources.h | 14 ++
 2 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 lib/zstd/decompress_sources.h

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index 87ff567fd76d..d42281d7d416 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
 #ifdef STATIC
 # define UNZSTD_PREBOOT
 # include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
 #endif
 
 #include
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index ..d2fe10af0043
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,14 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
--
2.29.2
[PATCH v6 0/3] Update to zstd-1.4.6
From: Nick Terrell

Please pull from g...@github.com:terrelln/linux.git tags/zstd-1.4.6-v6 to get these changes. Alternatively the patchset is included.

This patchset upgrades the zstd library to the latest upstream release. The current zstd version in the kernel is a modified version of upstream zstd-1.3.1. At the time it was integrated, zstd wasn't ready to be used in the kernel as-is. But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.6 upstream. I want the zstd version in the kernel to match up with a known upstream release, so we know exactly what code is running. Whenever this patchset is ready for merge, I will cut a release at the upstream commit that gets merged. This should not be necessary for future releases.

The kernel zstd library is automatically generated from upstream zstd. A script makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested every commit by upstream's continuous integration. When we cut a new zstd release, we will submit a patch to the kernel to update the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must build, so that precludes partial updates. Since the commit is 100% generated, I hope the review burden is lightened. I considered replaying upstream commits, but that is not possible because there have been ~3500 upstream commits since the last zstd import, and the commits don't all build individually. The bulk update preserves bisectability because bugs can be bisected to the zstd version update. At that point the update can be reverted, and we can work with upstream to find and fix the bug. After this big switch in how the kernel consumes zstd, future patches will be smaller, because they will only have one upstream release worth of changes each.

This patchset adds a new kernel-style wrapper around zstd. This wrapper API is functionally equivalent to the subset of the current zstd API that is currently used. The wrapper API changes to be kernel style so that the symbols don't collide with zstd's symbols. The update to zstd-1.4.6 maintains the same API and preserves the semantics, so that none of the callers need to be updated.

This patchset comes in 2 parts:

1. The first 2 patches prepare for the zstd upgrade. The first patch adds the new kernel style API so zstd can be upgraded without modifying any callers. The second patch adds an indirection for the lib/decompress_unzstd.c including of all decompression source files.
2. Import zstd-1.4.6. This patch is completely generated from upstream using automated tooling.

I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade using the compatibility wrapper, and after the final patch in this series. I tested kernel and initramfs decompression in i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6. I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k. I found:

* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during compression. For example the recent problem with large kernel decompression has been fixed upstream for over 2 years https://lkml.org/lkml/2020/9/29/27.

Please let me know if there is anything that I can do to ease the way for these patches. I think it is important because it gets large performance improvements, contains bug fixes, and is switching to a more maintainable model of consuming upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 means unknown. This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.

v2 -> v3:
* (3/9) Silence warnings by Kernel Test Robot: https://github.com/facebook/zstd/pull/2324 Stack size warnings remain, but these aren't new, and the functions it warns on are either unused or not in the maximum stack path. This patchset reduces zstd compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and more stack reduction would require significant changes that have the potential to introduce new bugs. However, I do hope to continue to reduce zstd stack usage in future versions.

v3 -> v4:
* (3/9) F
Re: [PATCH v2] lib/lz4: explicitly support in-place decompression
> On Nov 21, 2020, at 7:07 PM, Gao Xiang wrote: > > LZ4 final literal copy could be overlapped when doing > in-place decompression, so it's unsafe to just use memcpy() > on an optimized memcpy approach but memmove() instead. > > Upstream LZ4 has updated this years ago [1] (and the impact > is non-sensible [2] plus only a few bytes remain), this commit > just synchronizes LZ4 upstream code to the kernel side as well. > > It can be observed as EROFS in-place decompression failure > on specific files when X86_FEATURE_ERMS is unsupported, > memcpy() optimization of commit 59daa706fbec ("x86, mem: > Optimize memcpy by avoiding memory false dependece") will > be enabled then. > > Currently most modern x86-CPUs support ERMS, these CPUs just > use "rep movsb" approach so no problem at all. However, it can > still be verified with forcely disabling ERMS feature... > > arch/x86/lib/memcpy_64.S: >ALTERNATIVE_2 "jmp memcpy_orig", "", X86_FEATURE_REP_GOOD, \ > - "jmp memcpy_erms", X86_FEATURE_ERMS > + "jmp memcpy_orig", X86_FEATURE_ERMS > > We didn't observe any strange on arm64/arm/x86 platform before > since most memcpy() would behave in an increasing address order > ("copy upwards" [3]) and it's the correct order of in-place > decompression but it really needs an update to memmove() for sure > considering it's an undefined behavior according to the standard > and some unique optimization already exists in the kernel. > > [1] https://github.com/lz4/lz4/commit/33cb8518ac385835cc17be9a770b27b40cd0e15b > [2] https://github.com/lz4/lz4/pull/717#issuecomment-497818921 > [3] https://sourceware.org/bugzilla/show_bug.cgi?id=12518 > Cc: Yann Collet > Cc: Nick Terrell > Cc: Miao Xie > Cc: Chao Yu > Cc: Li Guifu > Cc: Guo Xuenan > Signed-off-by: Gao Xiang > --- > changes since v1: > - refine commit message; > - Cc more people. > > Hi Andrew, > > Could you kindly consider picking this patch up, although > the impact is EROFS but it touchs in-kernel lz4 library > anyway... 
> > Thanks, > Gao Xiang > > lib/lz4/lz4_decompress.c | 6 +- > lib/lz4/lz4defs.h| 1 + > 2 files changed, 6 insertions(+), 1 deletion(-) > > diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c > index 00cb0d0b73e1..8a7724a6ce2f 100644 > --- a/lib/lz4/lz4_decompress.c > +++ b/lib/lz4/lz4_decompress.c > @@ -263,7 +263,11 @@ static FORCE_INLINE int LZ4_decompress_generic( > } > } > > - LZ4_memcpy(op, ip, length); > + /* > + * supports overlapping memory regions; only matters > + * for in-place decompression scenarios > + */ > + LZ4_memmove(op, ip, length); > ip += length; > op += length; > > diff --git a/lib/lz4/lz4defs.h b/lib/lz4/lz4defs.h > index c91dd96ef629..673bd206aa98 100644 > --- a/lib/lz4/lz4defs.h > +++ b/lib/lz4/lz4defs.h > @@ -146,6 +146,7 @@ static FORCE_INLINE void LZ4_writeLE16(void *memPtr, U16 > value) > * environments. This is needed when decompressing the Linux Kernel, for > example. > */ > #define LZ4_memcpy(dst, src, size) __builtin_memcpy(dst, src, size) > +#define LZ4_memmove(dst, src, size) __builtin_memmove(dst, src, size) > > static FORCE_INLINE void LZ4_copy8(void *dst, const void *src) > { > -- > 2.18.4 > Looks good to me! You can add: Reviewed-by: Nick Terrell
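For readers following the patch above, here is a minimal sketch of why the final literal copy needs memmove() when decompressing in place. It is not code from lib/lz4: the helper names copy_final_literals() and overlap_copy_is_correct() are hypothetical, and the buffer layout is only an assumption modelled on the commit message (destination slightly below the source within one buffer, so the two ranges overlap).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of lib/lz4): copy the trailing literals of a
 * block when source and destination may live in the same buffer.
 */
static void copy_final_literals(unsigned char *op, const unsigned char *ip,
				size_t length)
{
	/* memmove() is defined for overlapping regions; memcpy() is not. */
	memmove(op, ip, length);
}

/* Returns 1 if the overlapping copy produced the expected bytes. */
static int overlap_copy_is_correct(void)
{
	unsigned char buf[16] = "..ABCDEFGH";

	/* Destination starts 2 bytes before the source, so the ranges overlap. */
	copy_final_literals(buf, buf + 2, 8);
	return memcmp(buf, "ABCDEFGH", 8) == 0;
}
```

With memcpy() the same call would be undefined behavior under the C standard, even though many implementations happen to copy in increasing address order and appear to work.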
Re: [PATCH v5 1/9] lib: zstd: Add zstd compatibility wrapper
> On Nov 10, 2020, at 10:39 AM, Christoph Hellwig wrote: > > On Mon, Nov 09, 2020 at 02:01:41PM -0500, Chris Mason wrote: >> You do consistently ask for a shim layer, but you haven't explained what >> we gain by diverging from the documented and tested API of the upstream zstd >> project. It's an important discussion given that we hope to regularly >> update the kernel side as they make improvements in zstd. > > An API that looks like every other kernel API, and doesn't cause endless > amount of churn because someone decided they need a new API flavor of > the day. Btw, I'm not asking for a shim layer - that was the compromise > we ended up with. I will put up a version of the patch set with the shim layer. I will follow the kernel style guide for the shim, which will involve function renaming. I will prefix the functions with “zstd_” instead of “ZSTD_” to make it clear that this is not the upstream zstd API, but rather a kernel wrapper (and be closer to the style guide). Other than renaming to follow the kernel style guide, I will keep the API as similar as possible to the existing API, to minimize churn. Please let me know if you have any particular requests for the shim that I haven't mentioned, or if you would prefer something else. Alternatively, comment on the patches once I put them up. Expect them later this week or weekend. Best, Nick > If zstd folks can't maintain a sane code base maybe we should just drop > this childish churning code base from the tree.
Re: [PATCH v5 1/9] lib: zstd: Add zstd compatibility wrapper
> On Nov 10, 2020, at 7:25 AM, David Sterba wrote: > > On Mon, Nov 09, 2020 at 02:01:41PM -0500, Chris Mason wrote: >> On 6 Nov 2020, at 13:38, Christoph Hellwig wrote: >>> You just keep resedning this crap, don't you? Haven't you been told >>> multiple times to provide a proper kernel API by now? >> >> You do consistently ask for a shim layer, but you haven’t explained >> what we gain by diverging from the documented and tested API of the >> upstream zstd project. It’s an important discussion given that we >> hope to regularly update the kernel side as they make improvements in >> zstd. >> >> The only benefit described so far seems to be camelcase related, but if >> there are problems in the API beyond that, I haven’t seen you describe >> them. I don’t think the camelcase alone justifies the added costs of >> the shim. > > The API change in this patchset is adding churn that wouldn't be > necessary if there were an upstream<->kernel API from the beginning. > > The patch 5/9 is almost entirely renaming just some internal identifiers > > - ZSTD_CStreamWorkspaceBound(params.cParams), > - ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT)); > + > ZSTD_estimateCStreamSize_usingCParams(params.cParams), > + ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT)); > > plus updating the names in the error strings. The compression API that > filesystems need is simple: > > - set up workspace and parameters > - compress buffer > - decompress buffer > > We really should not care if upstream has 3 functions for initializing > stream (ZSTD_initCStream/ZSTD_initStaticCStream/ZSTD_initCStream_advanced), > or if the name changes again in the future. Upstream will not change these function names. We guarantee the stable portion of our API has a fixed ABI. The unstable portion doesn’t make this guarantee, but we guarantee never to change function semantics in an incompatible way, and to provide long deprecation periods (years) when we delete functions. 
For reference, the only functions we’ve deleted/modified since v1.0.0 when we stabilized the zstd format 4 years ago are an old streaming API that was deprecated before v1.0.0. We’ve added new functions, and provided new recommended ways to use our API that we think are better. But, we highly value not breaking our users' code, so all the older APIs are still supported. This churn is caused because the current version of zstd inside the kernel is not upstream zstd. At the time of integration upstream zstd wasn’t ready to be used as-is in the kernel. When I integrated zstd into the kernel, I should’ve done more work to use upstream as-is. It was a mistake that I would like to correct, so the kernel can benefit from the significant performance improvements that upstream has made in the last few years. > This should not require explicit explanation, this should be a natural > requirement especially for separate projects that don't share the same > coding style but have to be integrated in some way. I’m not completely against providing a kernel shim. Personally, I don’t believe it provides much benefit. But if the consensus of kernel developers is that a shim provides a better API, then I’m happy to provide it. So far, I haven’t seen a clear consensus either way. That leaves me kind of stuck. Best, Nick
Re: [GIT PULL][PATCH v5 0/9] Update to zstd-1.4.6
> On Nov 6, 2020, at 9:15 AM, Josef Bacik wrote: > > On 11/3/20 1:05 AM, Nick Terrell wrote: >> From: Nick Terrell >> Please pull from >> g...@github.com:terrelln/linux.git tags/v5-zstd-1.4.6 >> to get these changes. Alternatively the patchset is included. > > Where did we come down on the code formatting question? Personally I'm of > the mind that as long as the consumers themselves adhere to the proper coding > style I'm fine not maintaining the code style as long as we get the benefit > of easily syncing in code from the upstream project. Thanks, The general consensus of everyone who has been involved in the discussion so far, seems to be that the benefits of keeping zstd in-sync with upstream outweigh the cost of accepting upstream’s API, though not everyone agrees. The alternative is to provide a wrapper around upstream’s API, but this makes it slightly harder to debug, since you have to go through the wrapper whose only purpose is to adapt to the coding style, and allows bugs to sneak into the kernel implementation, which aren’t present upstream. Additionally, in 2017 LZ4 switched to using upstream LZ4’s API in order to stay up to date with upstream, which sets precedent. I also help maintain LZ4, and once the zstd update is merged, I plan to work on making it easier to update LZ4 in the kernel when upstream updates. That will be a much smaller change, since LZ4 is already nearly using upstream’s code directly. Best, Nick
Re: [PATCH v5 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API
> On Nov 6, 2020, at 9:10 AM, Josef Bacik wrote: > > On 11/3/20 1:05 AM, Nick Terrell wrote: >> From: Nick Terrell >> Move away from the compatibility wrapper to the zstd-1.4.6 API. This >> code is functionally equivalent. >> Signed-off-by: Nick Terrell >> --- >> fs/btrfs/zstd.c | 48 >> 1 file changed, 28 insertions(+), 20 deletions(-) >> diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c >> index a7367ff573d4..6b466e090cd7 100644 >> --- a/fs/btrfs/zstd.c >> +++ b/fs/btrfs/zstd.c >> @@ -16,7 +16,7 @@ >> #include >> #include >> #include >> -#include >> +#include >> #include "misc.h" >> #include "compression.h" >> #include "ctree.h" >> @@ -159,8 +159,8 @@ static void zstd_calc_ws_mem_sizes(void) >> zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT); >> size_t level_size = >> max_t(size_t, >> - ZSTD_CStreamWorkspaceBound(params.cParams), >> - ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT)); >> + >> ZSTD_estimateCStreamSize_usingCParams(params.cParams), >> + ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT)); >> max_size = max_t(size_t, max_size, level_size); >> zstd_ws_mem_sizes[level - 1] = max_size; >> @@ -389,13 +389,23 @@ int zstd_compress_pages(struct list_head *ws, struct >> address_space *mapping, >> *total_in = 0; >> /* Initialize the stream */ >> -stream = ZSTD_initCStream(params, len, workspace->mem, >> -workspace->size); >> +stream = ZSTD_initStaticCStream(workspace->mem, workspace->size); >> if (!stream) { >> -pr_warn("BTRFS: ZSTD_initCStream failed\n"); >> +pr_warn("BTRFS: ZSTD_initStaticCStream failed\n"); >> ret = -EIO; >> goto out; >> } >> +{ >> +size_t ret2; >> + >> +ret2 = ZSTD_initCStream_advanced(stream, NULL, 0, params, len); >> +if (ZSTD_isError(ret2)) { >> +pr_warn("BTRFS: ZSTD_initCStream_advanced returned >> %s\n", >> +ZSTD_getErrorName(ret2)); >> +ret = -EIO; >> +goto out; >> +} >> +} > > Please don't do this, you can just add size_t ret2 at the top and not put > this in a block. 
Other than that the code looks fine, once you fix that you > can add Thanks for the review, I’ll make that change! > Reviewed-by: Josef Bacik > > Thanks, > > Josef
[PATCH v5 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is functionally equivalent. Signed-off-by: Nick Terrell --- fs/btrfs/zstd.c | 48 1 file changed, 28 insertions(+), 20 deletions(-) diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c index a7367ff573d4..6b466e090cd7 100644 --- a/fs/btrfs/zstd.c +++ b/fs/btrfs/zstd.c @@ -16,7 +16,7 @@ #include #include #include -#include +#include #include "misc.h" #include "compression.h" #include "ctree.h" @@ -159,8 +159,8 @@ static void zstd_calc_ws_mem_sizes(void) zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT); size_t level_size = max_t(size_t, - ZSTD_CStreamWorkspaceBound(params.cParams), - ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT)); + ZSTD_estimateCStreamSize_usingCParams(params.cParams), + ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT)); max_size = max_t(size_t, max_size, level_size); zstd_ws_mem_sizes[level - 1] = max_size; @@ -389,13 +389,23 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping, *total_in = 0; /* Initialize the stream */ - stream = ZSTD_initCStream(params, len, workspace->mem, - workspace->size); + stream = ZSTD_initStaticCStream(workspace->mem, workspace->size); if (!stream) { - pr_warn("BTRFS: ZSTD_initCStream failed\n"); + pr_warn("BTRFS: ZSTD_initStaticCStream failed\n"); ret = -EIO; goto out; } + { + size_t ret2; + + ret2 = ZSTD_initCStream_advanced(stream, NULL, 0, params, len); + if (ZSTD_isError(ret2)) { + pr_warn("BTRFS: ZSTD_initCStream_advanced returned %s\n", + ZSTD_getErrorName(ret2)); + ret = -EIO; + goto out; + } + } /* map in the first page of input data */ in_page = find_get_page(mapping, start >> PAGE_SHIFT); @@ -421,8 +431,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping, ret2 = ZSTD_compressStream(stream, >out_buf, >in_buf); if (ZSTD_isError(ret2)) { - pr_debug("BTRFS: ZSTD_compressStream returned %d\n", - ZSTD_getErrorCode(ret2)); + pr_debug("BTRFS: ZSTD_compressStream returned 
%s\n", + ZSTD_getErrorName(ret2)); ret = -EIO; goto out; } @@ -489,8 +499,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping, ret2 = ZSTD_endStream(stream, >out_buf); if (ZSTD_isError(ret2)) { - pr_debug("BTRFS: ZSTD_endStream returned %d\n", - ZSTD_getErrorCode(ret2)); + pr_debug("BTRFS: ZSTD_endStream returned %s\n", + ZSTD_getErrorName(ret2)); ret = -EIO; goto out; } @@ -557,10 +567,9 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb) unsigned long buf_start; unsigned long total_out = 0; - stream = ZSTD_initDStream( - ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size); + stream = ZSTD_initStaticDStream(workspace->mem, workspace->size); if (!stream) { - pr_debug("BTRFS: ZSTD_initDStream failed\n"); + pr_debug("BTRFS: ZSTD_initStaticDStream failed\n"); ret = -EIO; goto done; } @@ -579,8 +588,8 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb) ret2 = ZSTD_decompressStream(stream, >out_buf, >in_buf); if (ZSTD_isError(ret2)) { - pr_debug("BTRFS: ZSTD_decompressStream returned %d\n", - ZSTD_getErrorCode(ret2)); + pr_debug("BTRFS: ZSTD_decompressStream returned %s\n", + ZSTD_getErrorName(ret2)); ret = -EIO; goto done; } @@ -633,10 +642,9 @@ int zstd_decompress(struct list_head *ws, unsigned char *data_in, unsigned long pg_offset = 0; char *kaddr; -
[PATCH v5 8/9] lib: unzstd: Switch to the zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is functionally equivalent. Signed-off-by: Nick Terrell --- lib/decompress_unzstd.c | 40 ++-- 1 file changed, 14 insertions(+), 26 deletions(-) diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c index 3c6ad01ffcd5..efbe66501b34 100644 --- a/lib/decompress_unzstd.c +++ b/lib/decompress_unzstd.c @@ -73,7 +73,8 @@ #include #include -#include +#include +#include /* 128MB is the maximum window size supported by zstd. */ #define ZSTD_WINDOWSIZE_MAX(1 << ZSTD_WINDOWLOG_MAX) @@ -120,9 +121,9 @@ static int INIT decompress_single(const u8 *in_buf, long in_len, u8 *out_buf, long out_len, long *in_pos, void (*error)(char *x)) { - const size_t wksp_size = ZSTD_DCtxWorkspaceBound(); + const size_t wksp_size = ZSTD_estimateDCtxSize(); void *wksp = large_malloc(wksp_size); - ZSTD_DCtx *dctx = ZSTD_initDCtx(wksp, wksp_size); + ZSTD_DCtx *dctx = ZSTD_initStaticDCtx(wksp, wksp_size); int err; size_t ret; @@ -165,7 +166,6 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len, { ZSTD_inBuffer in; ZSTD_outBuffer out; - ZSTD_frameParams params; void *in_allocated = NULL; void *out_allocated = NULL; void *wksp = NULL; @@ -234,36 +234,24 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len, out.size = out_len; /* -* We need to know the window size to allocate the ZSTD_DStream. -* Since we are streaming, we need to allocate a buffer for the sliding -* window. The window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX -* (8 MB), so it is important to use the actual value so as not to -* waste memory when it is smaller. +* Zstd determines the workspace size from the window size written +* into the frame header. This ensures that we use the minimum value +* possible, since the window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX +* (1 GB), so it is very important to use the actual value. 
*/ - ret = ZSTD_getFrameParams(, in.src, in.size); + wksp_size = ZSTD_estimateDStreamSize_fromFrame(in.src, in.size); err = handle_zstd_error(ret, error); if (err) goto out; - if (ret != 0) { - error("ZSTD-compressed data has an incomplete frame header"); - err = -1; - goto out; - } - if (params.windowSize > ZSTD_WINDOWSIZE_MAX) { - error("ZSTD-compressed data has too large a window size"); + wksp = large_malloc(wksp_size); + if (wksp == NULL) { + error("Out of memory while allocating ZSTD_DStream"); err = -1; goto out; } - - /* -* Allocate the ZSTD_DStream now that we know how much memory is -* required. -*/ - wksp_size = ZSTD_DStreamWorkspaceBound(params.windowSize); - wksp = large_malloc(wksp_size); - dstream = ZSTD_initDStream(params.windowSize, wksp, wksp_size); + dstream = ZSTD_initStaticDStream(wksp, wksp_size); if (dstream == NULL) { - error("Out of memory while allocating ZSTD_DStream"); + error("ZSTD_initStaticDStream failed"); err = -1; goto out; } -- 2.28.0
[PATCH v5 9/9] lib: zstd: Remove zstd compatibility wrapper
From: Nick Terrell All callers have been transitioned to the new zstd-1.4.6 API. There are no more callers of the zstd compatibility wrapper, so delete it. Signed-off-by: Nick Terrell --- include/linux/zstd_compat.h | 116 1 file changed, 116 deletions(-) delete mode 100644 include/linux/zstd_compat.h diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h deleted file mode 100644 index cda9208bf04a.. --- a/include/linux/zstd_compat.h +++ /dev/null @@ -1,116 +0,0 @@ -/* - * Copyright (c) 2016-present, Facebook, Inc. - * All rights reserved. - * - * This source code is licensed under the BSD-style license found in the - * LICENSE file in the root directory of https://github.com/facebook/zstd. - * An additional grant of patent rights can be found in the PATENTS file in the - * same directory. - * - * This program is free software; you can redistribute it and/or modify it under - * the terms of the GNU General Public License version 2 as published by the - * Free Software Foundation. This program is dual-licensed; you may select - * either version 2 of the GNU General Public License ("GPL") or BSD license - * ("BSD"). - */ - -#ifndef ZSTD_COMPAT_H -#define ZSTD_COMPAT_H - -#include - -#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406) -/* - * This header provides backwards compatibility for the zstd-1.4.6 library - * upgrade. This header allows us to upgrade the zstd library version without - * modifying any callers. Then we will migrate callers from the compatibility - * wrapper one at a time until none remain. At which point we will delete this - * header. - * - * It is temporary and will be deleted once the upgrade is complete. 
- */ - -#include - -static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params) -{ -return ZSTD_estimateCCtxSize_usingCParams(compression_params); -} - -static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params) -{ -return ZSTD_estimateCStreamSize_usingCParams(compression_params); -} - -static inline size_t ZSTD_DCtxWorkspaceBound(void) -{ -return ZSTD_estimateDCtxSize(); -} - -static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size) -{ -return ZSTD_estimateDStreamSize(window_size); -} - -static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size) -{ -if (wksp == NULL) -return NULL; -return ZSTD_initStaticCCtx(wksp, wksp_size); -} - -static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, uint64_t pledged_src_size, void* wksp, size_t wksp_size) -{ -ZSTD_CStream* cstream; -size_t ret; - -if (wksp == NULL) -return NULL; - -cstream = ZSTD_initStaticCStream(wksp, wksp_size); -if (cstream == NULL) -return NULL; - -/* 0 means unknown in old API but means 0 in new API */ -if (pledged_src_size == 0) -pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN; - -ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size); -if (ZSTD_isError(ret)) -return NULL; - -return cstream; -} -#define ZSTD_initCStream ZSTD_initCStream_compat - -static inline ZSTD_DCtx* ZSTD_initDCtx(void* wksp, size_t wksp_size) -{ -if (wksp == NULL) -return NULL; -return ZSTD_initStaticDCtx(wksp, wksp_size); -} - -static inline ZSTD_DStream* ZSTD_initDStream_compat(unsigned long long window_size, void* wksp, size_t wksp_size) -{ -if (wksp == NULL) -return NULL; -(void)window_size; -return ZSTD_initStaticDStream(wksp, wksp_size); -} -#define ZSTD_initDStream ZSTD_initDStream_compat - -typedef ZSTD_frameHeader ZSTD_frameParams; - -static inline size_t ZSTD_getFrameParams(ZSTD_frameParams* frame_params, const void* src, size_t src_size) -{ -return ZSTD_getFrameHeader(frame_params, 
src, src_size); -} - -static inline size_t ZSTD_compressCCtx_compat(ZSTD_CCtx* cctx, void* dst, size_t dst_capacity, const void* src, size_t src_size, ZSTD_parameters params) -{ -return ZSTD_compress_advanced(cctx, dst, dst_capacity, src, src_size, NULL, 0, params); -} -#define ZSTD_compressCCtx ZSTD_compressCCtx_compat - -#endif /* ZSTD_VERSION_NUMBER >= 10406 */ -#endif /* ZSTD_COMPAT_H */ -- 2.28.0
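The one non-trivial piece of the wrapper deleted above is the pledged source size translation in ZSTD_initCStream_compat(): its comment notes that 0 meant "unknown" in the old API but means a zero-length input in the new one. The following is a small sketch of just that mapping, so it can be read without the rest of the header. The helper name pledged_size_for_new_api() and the SKETCH_CONTENTSIZE_UNKNOWN macro are hypothetical; the macro is assumed to mirror upstream's ZSTD_CONTENTSIZE_UNKNOWN and is redefined here only so the example builds without zstd.h.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumed to mirror upstream's ZSTD_CONTENTSIZE_UNKNOWN; redefined locally so
 * this sketch compiles without zstd.h.
 */
#define SKETCH_CONTENTSIZE_UNKNOWN (0ULL - 1)

/*
 * Hypothetical helper: translate an old-API pledged size (0 == unknown) into
 * the new-API convention (0 == empty input, all-ones == unknown).
 */
static unsigned long long pledged_size_for_new_api(uint64_t old_pledged_src_size)
{
	if (old_pledged_src_size == 0)
		return SKETCH_CONTENTSIZE_UNKNOWN;
	return old_pledged_src_size;
}
```

A caller that migrates off the wrapper and previously passed 0 to mean "unknown" would need an equivalent translation before calling ZSTD_initCStream_advanced(); callers that always pass a real length are unaffected.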
[GIT PULL][PATCH v5 0/9] Update to zstd-1.4.6
From: Nick Terrell Please pull from g...@github.com:terrelln/linux.git tags/v5-zstd-1.4.6 to get these changes. Alternatively the patchset is included. This patchset upgrades the zstd library to the latest upstream release. The current zstd version in the kernel is a modified version of upstream zstd-1.3.1. At the time it was integrated, zstd wasn't ready to be used in the kernel as-is. But, it is now possible to use upstream zstd directly in the kernel. I have not yet released zstd-1.4.6 upstream. I want the zstd version in the kernel to match up with a known upstream release, so we know exactly what code is running. Whenever this patchset is ready for merge, I will cut a release at the upstream commit that gets merged. This should not be necessary for future releases. The kernel zstd library is automatically generated from upstream zstd. A script makes the necessary changes and imports it into the kernel. The changes are: 1. Replace all libc dependencies with kernel replacements and rewrite includes. 2. Remove unnecessary portability macros like: #if defined(_MSC_VER). 3. Use the kernel xxhash instead of bundling it. This automation gets tested every commit by upstream's continuous integration. When we cut a new zstd release, we will submit a patch to the kernel to update the zstd version in the kernel. I've updated zstd to upstream with one big patch because every commit must build, so that precludes partial updates. Since the commit is 100% generated, I hope the review burden is lightened. I considered replaying upstream commits, but that is not possible because there have been ~3500 upstream commits since the last zstd import, and the commits don't all build individually. The bulk update preserves bisectability because bugs can be bisected to the zstd version update. At that point the update can be reverted, and we can work with upstream to find and fix the bug.
After this big switch in how the kernel consumes zstd, future patches will be smaller, because they will only have one upstream release worth of changes each. This patchset changes the zstd API from a custom kernel API to the upstream API. I considered wrapping the upstream API with a wrapper that is closer to the kernel style guide. Following advice from https://lkml.org/lkml/2020/9/17/814 I've chosen to use the upstream API directly, to minimize opportunities to introduce bugs, and because using the upstream API directly makes debugging and communication with upstream easier. This patchset comes in 3 parts: 1. The first 2 patches prepare for the zstd upgrade. The first patch adds a compatibility wrapper so zstd can be upgraded without modifying any callers. The second patch adds an indirection for the lib/decompress_unzstd.c including of all decompression source files. 2. Import zstd-1.4.6. This patch is completely generated from upstream using automated tooling. 3. Update all callers to the zstd-1.4.6 API then delete the compatibility wrapper. I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade using the compatibility wrapper, and after the final patch in this series. I tested kernel and initramfs decompression in i386 and arm. I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6. I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k. I found: * BtrFS zstd compression at levels 1 and 3 is 5% faster * BtrFS zstd decompression+read is 15% faster * SquashFS zstd decompression+read is 15% faster * F2FS zstd compression+write at level 3 is 8% faster * F2FS zstd decompression+read is 20% faster * ZRAM decompression+read is 30% faster * Kernel zstd decompression is 35% faster * Initramfs zstd decompression+build is 5% faster The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during compression.
For example the recent problem with large kernel decompression has been fixed upstream for over 2 years https://lkml.org/lkml/2020/9/29/27. Please let me know if there is anything that I can do to ease the way for these patches. I think it is important because it gets large performance improvements, contains bug fixes, and is switching to a more maintainable model of consuming upstream zstd directly, making it easy to keep up to date. Best, Nick Terrell v1 -> v2: * Successfully tested F2FS with help from Chao Yu to fix my test. * (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 means unknown. This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test. v2 -> v3: * (3/9) Silence warnings by Kernel Test Robot: https://github.com/facebook/zstd/pull/2324 Stack size warnings remain, but these aren't new, and the functions it warns on are either unused or not in the maximum stack path. This patchset reduces zstd compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and more stack reduction would require significant changes that have the potential to int
[PATCH v5 7/9] squashfs: zstd: Switch to the zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is functionally equivalent. Signed-off-by: Nick Terrell --- fs/squashfs/zstd_wrapper.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c index f8c512a6204e..add582409866 100644 --- a/fs/squashfs/zstd_wrapper.c +++ b/fs/squashfs/zstd_wrapper.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include #include "squashfs_fs.h" @@ -34,7 +34,7 @@ static void *zstd_init(struct squashfs_sb_info *msblk, void *buff) goto failed; wksp->window_size = max_t(size_t, msblk->block_size, SQUASHFS_METADATA_SIZE); - wksp->mem_size = ZSTD_DStreamWorkspaceBound(wksp->window_size); + wksp->mem_size = ZSTD_estimateDStreamSize(wksp->window_size); wksp->mem = vmalloc(wksp->mem_size); if (wksp->mem == NULL) goto failed; @@ -71,7 +71,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm, struct bvec_iter_all iter_all = {}; struct bio_vec *bvec = bvec_init_iter_all(_all); - stream = ZSTD_initDStream(wksp->window_size, wksp->mem, wksp->mem_size); + stream = ZSTD_initStaticDStream(wksp->mem, wksp->mem_size); if (!stream) { ERROR("Failed to initialize zstd decompressor\n"); @@ -122,8 +122,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm, break; if (ZSTD_isError(zstd_err)) { - ERROR("zstd decompression error: %d\n", - (int)ZSTD_getErrorCode(zstd_err)); + ERROR("zstd decompression error: %s\n", ZSTD_getErrorName(zstd_err)); error = -EIO; break; } -- 2.28.0
[PATCH v5 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is more efficient because it uses the single-pass API instead of the streaming API. The streaming API is not necessary because the whole input and output buffers are available. This saves memory because we don't need to allocate a buffer for the window. It is also more efficient because it saves unnecessary memcpy calls. Compression memory increases from 168 KB to 204 KB because upstream uses slightly more memory. Decompression memory decreases from 1.4 MB to 158 KB. Signed-off-by: Nick Terrell --- fs/f2fs/compress.c | 101 + 1 file changed, 37 insertions(+), 64 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 57a6360b9827..8f8234877666 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -11,7 +11,8 @@ #include #include #include -#include +#include +#include #include "f2fs.h" #include "node.h" @@ -322,21 +323,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = { static int zstd_init_compress_ctx(struct compress_ctx *cc) { ZSTD_parameters params; - ZSTD_CStream *stream; + ZSTD_CCtx *ctx; void *workspace; unsigned int workspace_size; params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0); - workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams); + workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams); workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode), workspace_size, GFP_NOFS); if (!workspace) return -ENOMEM; - stream = ZSTD_initCStream(params, 0, workspace, workspace_size); - if (!stream) { - printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initCStream failed\n", + ctx = ZSTD_initStaticCCtx(workspace, workspace_size); + if (!ctx) { + printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_inittaticCStream failed\n", KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id, __func__); kvfree(workspace); @@ -344,7 +345,7 @@ static int zstd_init_compress_ctx(struct compress_ctx *cc) } cc->private = workspace; - cc->private2 = stream; + cc->private2 = 
ctx; cc->clen = cc->rlen - PAGE_SIZE - COMPRESS_HEADER_SIZE; return 0; @@ -359,66 +360,48 @@ static void zstd_destroy_compress_ctx(struct compress_ctx *cc) static int zstd_compress_pages(struct compress_ctx *cc) { - ZSTD_CStream *stream = cc->private2; - ZSTD_inBuffer inbuf; - ZSTD_outBuffer outbuf; - int src_size = cc->rlen; - int dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE; - int ret; - - inbuf.pos = 0; - inbuf.src = cc->rbuf; - inbuf.size = src_size; - - outbuf.pos = 0; - outbuf.dst = cc->cbuf->cdata; - outbuf.size = dst_size; + ZSTD_CCtx *ctx = cc->private2; + const size_t src_size = cc->rlen; + const size_t dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE; + ZSTD_parameters params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, src_size, 0); + size_t ret; - ret = ZSTD_compressStream(stream, , ); + ret = ZSTD_compress_advanced( + ctx, cc->cbuf->cdata, dst_size, cc->rbuf, src_size, NULL, 0, params); if (ZSTD_isError(ret)) { - printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compressStream failed, ret: %d\n", - KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id, - __func__, ZSTD_getErrorCode(ret)); - return -EIO; - } - - ret = ZSTD_endStream(stream, ); - if (ZSTD_isError(ret)) { - printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_endStream returned %d\n", + /* +* there is compressed data remained in intermediate buffer due to +* no more space in cbuf.cdata +*/ + if (ZSTD_getErrorCode(ret) == ZSTD_error_dstSize_tooSmall) + return -EAGAIN; + /* other compression errors return -EIO */ + printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compress_advanced failed, err: %s\n", KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id, - __func__, ZSTD_getErrorCode(ret)); + __func__, ZSTD_getErrorName(ret)); return -EIO; } - /* -* there is compressed data remained in intermediate buffer due to -* no more space in cbuf.cdata -*/ - if (ret) - return -EAGAIN; - - cc->clen = outbuf.
[PATCH v5 4/9] crypto: zstd: Switch to zstd-1.4.6 API
From: Nick Terrell

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell
---
 crypto/zstd.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/crypto/zstd.c b/crypto/zstd.c
index dcda3cad3b5c..767fe2fbe009 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
@@ -24,16 +24,15 @@ struct zstd_ctx {
 	void *dwksp;
 };
-static ZSTD_parameters zstd_params(void)
-{
-	return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
-}
-
 static int zstd_comp_init(struct zstd_ctx *ctx)
 {
 	int ret = 0;
-	const ZSTD_parameters params = zstd_params();
-	const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+	const size_t wksp_size = ZSTD_estimateCCtxSize(ZSTD_DEF_LEVEL);
+
+	if (ZSTD_isError(wksp_size)) {
+		ret = -EINVAL;
+		goto out_free;
+	}
 	ctx->cwksp = vzalloc(wksp_size);
 	if (!ctx->cwksp) {
@@ -41,7 +40,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 		goto out;
 	}
-	ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+	ctx->cctx = ZSTD_initStaticCCtx(ctx->cwksp, wksp_size);
 	if (!ctx->cctx) {
 		ret = -EINVAL;
 		goto out_free;
@@ -56,7 +55,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 static int zstd_decomp_init(struct zstd_ctx *ctx)
 {
 	int ret = 0;
-	const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+	const size_t wksp_size = ZSTD_estimateDCtxSize();
 	ctx->dwksp = vzalloc(wksp_size);
 	if (!ctx->dwksp) {
@@ -64,7 +63,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx)
 		goto out;
 	}
-	ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+	ctx->dctx = ZSTD_initStaticDCtx(ctx->dwksp, wksp_size);
 	if (!ctx->dctx) {
 		ret = -EINVAL;
 		goto out_free;
@@ -152,9 +151,8 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
 {
 	size_t out_len;
 	struct zstd_ctx *zctx = ctx;
-	const ZSTD_parameters params = zstd_params();
-	out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
+	out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, ZSTD_DEF_LEVEL);
 	if (ZSTD_isError(out_len))
 		return -EINVAL;
 	*dlen = out_len;
--
2.28.0
[PATCH v5 2/9] lib: zstd: Add decompress_sources.h for decompress_unzstd
From: Nick Terrell

Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed. This allows us to upgrade the
zstd library version without modifying any callers. Instead we just need
to update decompress_sources.h.

Signed-off-by: Nick Terrell
---
 lib/decompress_unzstd.c       |  6 +-
 lib/zstd/decompress_sources.h | 14 ++
 2 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 lib/zstd/decompress_sources.h

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index 6bb805aeec08..3c6ad01ffcd5 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
 #ifdef STATIC
 # define UNZSTD_PREBOOT
 # include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
 #endif
 #include
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index ..ccb4960ea0cd
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
--
2.28.0
[PATCH v5 1/9] lib: zstd: Add zstd compatibility wrapper
From: Nick Terrell Adds zstd_compat.h which provides the necessary functions from the current zstd.h API. It is only active for zstd versions 1.4.6 and newer. That means it is disabled currently, but will become active when a later patch in this series updates the zstd library in the kernel to 1.4.6. This header allows the zstd upgrade to 1.4.6 without changing any callers, since they all include zstd through the compatibility wrapper. Later patches in this series transition each caller away from the compatibility wrapper. After all the callers have been transitioned away from the compatibility wrapper, the final patch in this series deletes it. Signed-off-by: Nick Terrell --- crypto/zstd.c | 2 +- fs/btrfs/zstd.c | 2 +- fs/f2fs/compress.c | 2 +- fs/squashfs/zstd_wrapper.c | 2 +- include/linux/zstd_compat.h | 116 lib/decompress_unzstd.c | 2 +- 6 files changed, 121 insertions(+), 5 deletions(-) create mode 100644 include/linux/zstd_compat.h diff --git a/crypto/zstd.c b/crypto/zstd.c index 1a3309f066f7..dcda3cad3b5c 100644 --- a/crypto/zstd.c +++ b/crypto/zstd.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c index 9a4871636c6c..a7367ff573d4 100644 --- a/fs/btrfs/zstd.c +++ b/fs/btrfs/zstd.c @@ -16,7 +16,7 @@ #include #include #include -#include +#include #include "misc.h" #include "compression.h" #include "ctree.h" diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 14262e0f1cd6..57a6360b9827 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include "f2fs.h" #include "node.h" diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c index b7cb1faa652d..f8c512a6204e 100644 --- a/fs/squashfs/zstd_wrapper.c +++ b/fs/squashfs/zstd_wrapper.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include #include "squashfs_fs.h" diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h 
new file mode 100644 index ..cda9208bf04a --- /dev/null +++ b/include/linux/zstd_compat.h @@ -0,0 +1,116 @@ +/* + * Copyright (c) 2016-present, Facebook, Inc. + * All rights reserved. + * + * This source code is licensed under the BSD-style license found in the + * LICENSE file in the root directory of https://github.com/facebook/zstd. + * An additional grant of patent rights can be found in the PATENTS file in the + * same directory. + * + * This program is free software; you can redistribute it and/or modify it under + * the terms of the GNU General Public License version 2 as published by the + * Free Software Foundation. This program is dual-licensed; you may select + * either version 2 of the GNU General Public License ("GPL") or BSD license + * ("BSD"). + */ + +#ifndef ZSTD_COMPAT_H +#define ZSTD_COMPAT_H + +#include + +#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406) +/* + * This header provides backwards compatibility for the zstd-1.4.6 library + * upgrade. This header allows us to upgrade the zstd library version without + * modifying any callers. Then we will migrate callers from the compatibility + * wrapper one at a time until none remain. At which point we will delete this + * header. + * + * It is temporary and will be deleted once the upgrade is complete. 
+ */ + +#include + +static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params) +{ +return ZSTD_estimateCCtxSize_usingCParams(compression_params); +} + +static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params) +{ +return ZSTD_estimateCStreamSize_usingCParams(compression_params); +} + +static inline size_t ZSTD_DCtxWorkspaceBound(void) +{ +return ZSTD_estimateDCtxSize(); +} + +static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size) +{ +return ZSTD_estimateDStreamSize(window_size); +} + +static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size) +{ +if (wksp == NULL) +return NULL; +return ZSTD_initStaticCCtx(wksp, wksp_size); +} + +static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, uint64_t pledged_src_size, void* wksp, size_t wksp_size) +{ +ZSTD_CStream* cstream; +size_t ret; + +if (wksp == NULL) +return NULL; + +cstream = ZSTD_initStaticCStream(wksp, wksp_size); +if (cstream == NULL) +return NULL; + +/* 0 means unknown in old API but means 0 in new API */ +if (pledged_src_size == 0) +pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN; + +ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size); +if (ZSTD_isError(ret)) +return NULL; + +return cstream; +} +#define ZS
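Editorial sketch, not part of the patch: the two less obvious mechanics of the wrapper are the version gate and the pledged-size translation. The snippet below restates them in isolation. All SKETCH_/sketch_ names are hypothetical stand-ins; the real identifiers are ZSTD_VERSION_NUMBER and ZSTD_CONTENTSIZE_UNKNOWN from upstream zstd.h, and the encoding shown (major*10000 + minor*100 + release) is upstream's.

```c
#include <stdint.h>

/*
 * Hypothetical stand-ins for illustration only; the real names are
 * ZSTD_VERSION_NUMBER and ZSTD_CONTENTSIZE_UNKNOWN from upstream zstd.h.
 */
#define SKETCH_VERSION_NUMBER(major, minor, release) \
	((major) * 100 * 100 + (minor) * 100 + (release))
#define SKETCH_CONTENTSIZE_UNKNOWN (0ULL - 1)

/* The wrapper body is only active once the library reports >= 1.4.6. */
static inline int sketch_compat_active(unsigned int version_number)
{
	return version_number >= SKETCH_VERSION_NUMBER(1, 4, 6);
}

/*
 * Old kernel API: a pledged source size of 0 meant "unknown".
 * Upstream API: 0 means an empty input, so 0 has to be translated.
 */
static inline unsigned long long sketch_pledged_size(uint64_t old_api_size)
{
	return old_api_size == 0 ? SKETCH_CONTENTSIZE_UNKNOWN : old_api_size;
}
```

With this encoding 1.4.6 becomes 10406, which is the constant the header tests against, and the in-kernel 1.3.1 (10301) leaves the wrapper disabled, as the commit message describes.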
Re: [GIT PULL][PATCH v4 0/9] Update to zstd-1.4.6
> On Oct 1, 2020, at 3:18 AM, David Sterba wrote:
>
> On Wed, Sep 30, 2020 at 08:49:49PM +0000, Nick Terrell wrote:
>>> On Sep 29, 2020, at 11:53 PM, Nick Terrell wrote:
>>>
>>> From: Nick Terrell
>>
>> It has been brought to my attention that patch 3 hasn’t made it to patchwork,
>> likely because it is too large. I’ll include a pull request in the next
>> cover letter, together with the patches (if needed).
>
> The patch 3/9 saved to a file is 1.6M, over 35000 lines, the diffstat
> says:
>
> 66 files changed, 24268 insertions(+), 12889 deletions(-)
>
> Seriously, this is wrong in so many ways. There's the rationale for
> one-time change etc, but the actual result is beyond what I would accept
> and would not encourage anyone to merge as-is.

I’m open to suggestions on how to get a zstd update done better. I don’t know
of any way to break this patch up into smaller patches that all compile. The
code is all generated directly from upstream and modified to work in the kernel
by automated scripts.

I think the benefits of updating zstd are pretty clear: bug fixes, 3 years of
testing, features, debuggability, support from zstd upstream, and significant
performance improvements. So I hope we can come up with a way forward to get
this merged.

A patch this large is a one-time change. However, zstd updates in general will
be large, each containing hundreds of commits' worth of changes (as opposed to
~3500 commits and a structural change in this diff). For example, the diff
between two upstream versions ranges from 50 KB to 500 KB. Zstd is an actively
maintained project, so there is going to be churn when consuming it. But it
also means that we’re actively supporting the project if any problems occur.

My view is that kernel developers don’t need to review upstream zstd’s code. We
should focus on the diff from upstream, and on ensuring that everything works
in the kernel environment. The imported code from upstream zstd is ~30K LOC,
which is too large for anyone to reasonably review.

As mentioned in the patch, this commit shows the diff from upstream zstd, which
is much more manageable:

https://github.com/terrelln/linux/commit/467c9ea1df1100db48c020c3c8b282a2a30f5116

I generated it by importing upstream zstd as-is into the kernel file structure,
then running the automation to generate the kernel patch from upstream and
importing the result into the kernel on top of the upstream patch.

Best,
Nick
Re: [GIT PULL][PATCH v4 0/9] Update to zstd-1.4.6
> On Sep 29, 2020, at 11:53 PM, Nick Terrell wrote:
>
> From: Nick Terrell

It has been brought to my attention that patch 3 hasn’t made it to patchwork,
likely because it is too large. I’ll include a pull request in the next cover
letter, together with the patches (if needed).

Please pull from

  g...@github.com:terrelln/linux.git tags/v4-zstd-1.4.6

to get these changes.

> This patchset upgrades the zstd library to the latest upstream release. The
> current zstd version in the kernel is a modified version of upstream
> zstd-1.3.1. At the time it was integrated, zstd wasn't ready to be used in
> the kernel as-is. But, it is now possible to use upstream zstd directly in
> the kernel.
>
> I have not yet released zstd-1.4.6 upstream. I want the zstd version in the
> kernel to match up with a known upstream release, so we know exactly what
> code is running. Whenever this patchset is ready for merge, I will cut a
> release at the upstream commit that gets merged. This should not be necessary
> for future releases.
>
> The kernel zstd library is automatically generated from upstream zstd. A
> script makes the necessary changes and imports it into the kernel. The
> changes are:
>
> 1. Replace all libc dependencies with kernel replacements and rewrite
>    includes.
> 2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
> 3. Use the kernel xxhash instead of bundling it.
>
> This automation is tested on every commit by upstream's continuous
> integration. When we cut a new zstd release, we will submit a patch to the
> kernel to update the zstd version in the kernel.
>
> I've updated zstd to upstream with one big patch because every commit must
> build, so that precludes partial updates. Since the commit is 100% generated,
> I hope the review burden is lightened. I considered replaying upstream
> commits, but that is not possible because there have been ~3500 upstream
> commits since the last zstd import, and the commits don't all build
> individually. The bulk update preserves bisectability because bugs can be
> bisected to the zstd version update. At that point the update can be
> reverted, and we can work with upstream to find and fix the bug. After this
> big switch in how the kernel consumes zstd, future patches will be smaller,
> because they will only have one upstream release worth of changes each.
>
> This patchset changes the zstd API from a custom kernel API to the upstream
> API. I considered wrapping the upstream API with a wrapper that is closer to
> the kernel style guide. Following advice from
> https://lkml.org/lkml/2020/9/17/814 I've chosen to use the upstream API
> directly, to minimize opportunities to introduce bugs, and because using the
> upstream API directly makes debugging and communication with upstream easier.
>
> This patchset comes in 3 parts:
> 1. The first 2 patches prepare for the zstd upgrade. The first patch adds a
>    compatibility wrapper so zstd can be upgraded without modifying any
>    callers. The second patch adds an indirection for the
>    lib/decompress_unzstd.c including of all decompression source files.
> 2. Import zstd-1.4.6. This patch is completely generated from upstream using
>    automated tooling.
> 3. Update all callers to the zstd-1.4.6 API then delete the compatibility
>    wrapper.
>
> I tested every caller of zstd on x86_64. I tested both after the 1.4.6
> upgrade using the compatibility wrapper, and after the final patch in this
> series.
>
> I tested kernel and initramfs decompression on i386 and arm.
>
> I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6.
> I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k.
> I found:
> * BtrFS zstd compression at levels 1 and 3 is 5% faster
> * BtrFS zstd decompression+read is 15% faster
> * SquashFS zstd decompression+read is 15% faster
> * F2FS zstd compression+write at level 3 is 8% faster
> * F2FS zstd decompression+read is 20% faster
> * ZRAM decompression+read is 30% faster
> * Kernel zstd decompression is 35% faster
> * Initramfs zstd decompression+build is 5% faster
>
> The latest zstd also offers bug fixes and a 1 KB reduction in stack usage
> during compression. For example the recent problem with large kernel
> decompression has been fixed upstream for over 2 years
> https://lkml.org/lkml/2020/9/29/27.
>
> Please let me know if there is anything that I can do to ease the way for
> these patches. I think it is important because it gets large performance
> improvements, contains bug fixes, and is switching to a more maintainable
> model of consuming upstream zstd directly, making it easy to keep up to date.
Re: [PATCH v4 0/9] Update to zstd-1.4.6
> On Sep 29, 2020, at 11:53 PM, Christoph Hellwig wrote:
>
> As you keep resending this I keep retelling you that you should not do it.
> Please provide a proper Linux API, and switch to that. Versioned APIs
> have absolutely no business in the Linux kernel.

The API is not versioned. We provide a stable ABI for a large section of our
API, and the parts that aren’t ABI stable don’t change in semantics, and
undergo long deprecation periods before being removed.

The change of callers is a one-time change to transition from the existing API
in the kernel, which was never upstream's API, to upstream's API.

-Nick

> On Tue, Sep 29, 2020 at 11:53:09PM -0700, Nick Terrell wrote:
>> From: Nick Terrell
>>
>> This patchset upgrades the zstd library to the latest upstream release. The
>> current zstd version in the kernel is a modified version of upstream
>> zstd-1.3.1. At the time it was integrated, zstd wasn't ready to be used in
>> the kernel as-is. But, it is now possible to use upstream zstd directly in
>> the kernel.
>>
>> I have not yet released zstd-1.4.6 upstream. I want the zstd version in the
>> kernel to match up with a known upstream release, so we know exactly what
>> code is running. Whenever this patchset is ready for merge, I will cut a
>> release at the upstream commit that gets merged. This should not be
>> necessary for future releases.
>>
>> The kernel zstd library is automatically generated from upstream zstd. A
>> script makes the necessary changes and imports it into the kernel. The
>> changes are:
>>
>> 1. Replace all libc dependencies with kernel replacements and rewrite
>>    includes.
>> 2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
>> 3. Use the kernel xxhash instead of bundling it.
>>
>> This automation is tested on every commit by upstream's continuous
>> integration. When we cut a new zstd release, we will submit a patch to the
>> kernel to update the zstd version in the kernel.
[PATCH v4 8/9] lib: unzstd: Switch to the zstd-1.4.6 API
From: Nick Terrell

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell
---
 lib/decompress_unzstd.c | 40 ++--
 1 file changed, 14 insertions(+), 26 deletions(-)

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index a79f705f236d..d4685df0e120 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -73,7 +73,8 @@
 #include
 #include
-#include
+#include
+#include
 /* 128MB is the maximum window size supported by zstd. */
 #define ZSTD_WINDOWSIZE_MAX	(1 << ZSTD_WINDOWLOG_MAX)
@@ -120,9 +121,9 @@ static int INIT decompress_single(const u8 *in_buf, long in_len, u8 *out_buf,
 				  long out_len, long *in_pos,
 				  void (*error)(char *x))
 {
-	const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+	const size_t wksp_size = ZSTD_estimateDCtxSize();
 	void *wksp = large_malloc(wksp_size);
-	ZSTD_DCtx *dctx = ZSTD_initDCtx(wksp, wksp_size);
+	ZSTD_DCtx *dctx = ZSTD_initStaticDCtx(wksp, wksp_size);
 	int err;
 	size_t ret;
@@ -165,7 +166,6 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
 {
 	ZSTD_inBuffer in;
 	ZSTD_outBuffer out;
-	ZSTD_frameParams params;
 	void *in_allocated = NULL;
 	void *out_allocated = NULL;
 	void *wksp = NULL;
@@ -229,36 +229,24 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
 	out.size = out_len;
 	/*
-	 * We need to know the window size to allocate the ZSTD_DStream.
-	 * Since we are streaming, we need to allocate a buffer for the sliding
-	 * window. The window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
-	 * (8 MB), so it is important to use the actual value so as not to
-	 * waste memory when it is smaller.
+	 * Zstd determines the workspace size from the window size written
+	 * into the frame header. This ensures that we use the minimum value
+	 * possible, since the window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
+	 * (1 GB), so it is very important to use the actual value.
 	 */
-	ret = ZSTD_getFrameParams(, in.src, in.size);
+	wksp_size = ZSTD_estimateDStreamSize_fromFrame(in.src, in.size);
 	err = handle_zstd_error(ret, error);
 	if (err)
 		goto out;
-	if (ret != 0) {
-		error("ZSTD-compressed data has an incomplete frame header");
-		err = -1;
-		goto out;
-	}
-	if (params.windowSize > ZSTD_WINDOWSIZE_MAX) {
-		error("ZSTD-compressed data has too large a window size");
+	wksp = large_malloc(wksp_size);
+	if (wksp == NULL) {
+		error("Out of memory while allocating ZSTD_DStream");
 		err = -1;
 		goto out;
 	}
-
-	/*
-	 * Allocate the ZSTD_DStream now that we know how much memory is
-	 * required.
-	 */
-	wksp_size = ZSTD_DStreamWorkspaceBound(params.windowSize);
-	wksp = large_malloc(wksp_size);
-	dstream = ZSTD_initDStream(params.windowSize, wksp, wksp_size);
+	dstream = ZSTD_initStaticDStream(wksp, wksp_size);
 	if (dstream == NULL) {
-		error("Out of memory while allocating ZSTD_DStream");
+		error("ZSTD_initStaticDStream failed");
 		err = -1;
 		goto out;
 	}
--
2.28.0
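Editorial sketch, not part of the patch: the comment above says the workspace size follows from the window size written into the frame header. As a rough illustration of where that number comes from, the function below decodes the Window_Descriptor byte as specified by the zstd frame format (RFC 8878). The name sketch_window_size is hypothetical; in the patch the library does this parsing internally via ZSTD_estimateDStreamSize_fromFrame(). The sketch ignores single-segment frames, which carry no Window_Descriptor.

```c
#include <stdint.h>

/*
 * Decode the Window_Descriptor byte of a zstd frame header into a window
 * size in bytes, following the zstd frame format (RFC 8878):
 *   bits 7..3 = exponent, bits 2..0 = mantissa.
 * Hypothetical helper for illustration; not a kernel or zstd function.
 */
static uint64_t sketch_window_size(uint8_t window_descriptor)
{
	const unsigned int exponent = window_descriptor >> 3;
	const unsigned int mantissa = window_descriptor & 7;
	const uint64_t window_base = 1ULL << (10 + exponent);
	const uint64_t window_add = (window_base / 8) * mantissa;

	return window_base + window_add;
}
```

A descriptor of 0 gives the 1 KB minimum mentioned in the comment, and an exponent of 17 with a zero mantissa gives 128 MB.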
[PATCH v4 9/9] lib: zstd: Remove zstd compatibility wrapper
From: Nick Terrell All callers have been transitioned to the new zstd-1.4.6 API. There are no more callers of the zstd compatibility wrapper, so delete it. Signed-off-by: Nick Terrell --- include/linux/zstd_compat.h | 116 1 file changed, 116 deletions(-) delete mode 100644 include/linux/zstd_compat.h diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h deleted file mode 100644 index cda9208bf04a.. --- a/include/linux/zstd_compat.h +++ /dev/null @@ -1,116 +0,0 @@ -/* - * Copyright (c) 2016-present, Facebook, Inc. - * All rights reserved. - * - * This source code is licensed under the BSD-style license found in the - * LICENSE file in the root directory of https://github.com/facebook/zstd. - * An additional grant of patent rights can be found in the PATENTS file in the - * same directory. - * - * This program is free software; you can redistribute it and/or modify it under - * the terms of the GNU General Public License version 2 as published by the - * Free Software Foundation. This program is dual-licensed; you may select - * either version 2 of the GNU General Public License ("GPL") or BSD license - * ("BSD"). - */ - -#ifndef ZSTD_COMPAT_H -#define ZSTD_COMPAT_H - -#include - -#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406) -/* - * This header provides backwards compatibility for the zstd-1.4.6 library - * upgrade. This header allows us to upgrade the zstd library version without - * modifying any callers. Then we will migrate callers from the compatibility - * wrapper one at a time until none remain. At which point we will delete this - * header. - * - * It is temporary and will be deleted once the upgrade is complete. 
- */ - -#include - -static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params) -{ -return ZSTD_estimateCCtxSize_usingCParams(compression_params); -} - -static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params) -{ -return ZSTD_estimateCStreamSize_usingCParams(compression_params); -} - -static inline size_t ZSTD_DCtxWorkspaceBound(void) -{ -return ZSTD_estimateDCtxSize(); -} - -static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size) -{ -return ZSTD_estimateDStreamSize(window_size); -} - -static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size) -{ -if (wksp == NULL) -return NULL; -return ZSTD_initStaticCCtx(wksp, wksp_size); -} - -static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, uint64_t pledged_src_size, void* wksp, size_t wksp_size) -{ -ZSTD_CStream* cstream; -size_t ret; - -if (wksp == NULL) -return NULL; - -cstream = ZSTD_initStaticCStream(wksp, wksp_size); -if (cstream == NULL) -return NULL; - -/* 0 means unknown in old API but means 0 in new API */ -if (pledged_src_size == 0) -pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN; - -ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size); -if (ZSTD_isError(ret)) -return NULL; - -return cstream; -} -#define ZSTD_initCStream ZSTD_initCStream_compat - -static inline ZSTD_DCtx* ZSTD_initDCtx(void* wksp, size_t wksp_size) -{ -if (wksp == NULL) -return NULL; -return ZSTD_initStaticDCtx(wksp, wksp_size); -} - -static inline ZSTD_DStream* ZSTD_initDStream_compat(unsigned long long window_size, void* wksp, size_t wksp_size) -{ -if (wksp == NULL) -return NULL; -(void)window_size; -return ZSTD_initStaticDStream(wksp, wksp_size); -} -#define ZSTD_initDStream ZSTD_initDStream_compat - -typedef ZSTD_frameHeader ZSTD_frameParams; - -static inline size_t ZSTD_getFrameParams(ZSTD_frameParams* frame_params, const void* src, size_t src_size) -{ -return ZSTD_getFrameHeader(frame_params, 
src, src_size); -} - -static inline size_t ZSTD_compressCCtx_compat(ZSTD_CCtx* cctx, void* dst, size_t dst_capacity, const void* src, size_t src_size, ZSTD_parameters params) -{ -return ZSTD_compress_advanced(cctx, dst, dst_capacity, src, src_size, NULL, 0, params); -} -#define ZSTD_compressCCtx ZSTD_compressCCtx_compat - -#endif /* ZSTD_VERSION_NUMBER >= 10406 */ -#endif /* ZSTD_COMPAT_H */ -- 2.28.0
[PATCH v4 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is functionally equivalent. Signed-off-by: Nick Terrell --- fs/btrfs/zstd.c | 48 1 file changed, 28 insertions(+), 20 deletions(-) diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c index a7367ff573d4..6b466e090cd7 100644 --- a/fs/btrfs/zstd.c +++ b/fs/btrfs/zstd.c @@ -16,7 +16,7 @@ #include #include #include -#include +#include #include "misc.h" #include "compression.h" #include "ctree.h" @@ -159,8 +159,8 @@ static void zstd_calc_ws_mem_sizes(void) zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT); size_t level_size = max_t(size_t, - ZSTD_CStreamWorkspaceBound(params.cParams), - ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT)); + ZSTD_estimateCStreamSize_usingCParams(params.cParams), + ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT)); max_size = max_t(size_t, max_size, level_size); zstd_ws_mem_sizes[level - 1] = max_size; @@ -389,13 +389,23 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping, *total_in = 0; /* Initialize the stream */ - stream = ZSTD_initCStream(params, len, workspace->mem, - workspace->size); + stream = ZSTD_initStaticCStream(workspace->mem, workspace->size); if (!stream) { - pr_warn("BTRFS: ZSTD_initCStream failed\n"); + pr_warn("BTRFS: ZSTD_initStaticCStream failed\n"); ret = -EIO; goto out; } + { + size_t ret2; + + ret2 = ZSTD_initCStream_advanced(stream, NULL, 0, params, len); + if (ZSTD_isError(ret2)) { + pr_warn("BTRFS: ZSTD_initCStream_advanced returned %s\n", + ZSTD_getErrorName(ret2)); + ret = -EIO; + goto out; + } + } /* map in the first page of input data */ in_page = find_get_page(mapping, start >> PAGE_SHIFT); @@ -421,8 +431,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping, ret2 = ZSTD_compressStream(stream, >out_buf, >in_buf); if (ZSTD_isError(ret2)) { - pr_debug("BTRFS: ZSTD_compressStream returned %d\n", - ZSTD_getErrorCode(ret2)); + pr_debug("BTRFS: ZSTD_compressStream returned 
%s\n", + ZSTD_getErrorName(ret2)); ret = -EIO; goto out; } @@ -489,8 +499,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping, ret2 = ZSTD_endStream(stream, >out_buf); if (ZSTD_isError(ret2)) { - pr_debug("BTRFS: ZSTD_endStream returned %d\n", - ZSTD_getErrorCode(ret2)); + pr_debug("BTRFS: ZSTD_endStream returned %s\n", + ZSTD_getErrorName(ret2)); ret = -EIO; goto out; } @@ -557,10 +567,9 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb) unsigned long buf_start; unsigned long total_out = 0; - stream = ZSTD_initDStream( - ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size); + stream = ZSTD_initStaticDStream(workspace->mem, workspace->size); if (!stream) { - pr_debug("BTRFS: ZSTD_initDStream failed\n"); + pr_debug("BTRFS: ZSTD_initStaticDStream failed\n"); ret = -EIO; goto done; } @@ -579,8 +588,8 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb) ret2 = ZSTD_decompressStream(stream, >out_buf, >in_buf); if (ZSTD_isError(ret2)) { - pr_debug("BTRFS: ZSTD_decompressStream returned %d\n", - ZSTD_getErrorCode(ret2)); + pr_debug("BTRFS: ZSTD_decompressStream returned %s\n", + ZSTD_getErrorName(ret2)); ret = -EIO; goto done; } @@ -633,10 +642,9 @@ int zstd_decompress(struct list_head *ws, unsigned char *data_in, unsigned long pg_offset = 0; char *kaddr; -
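Editorial sketch, not part of the patch: the zstd_calc_ws_mem_sizes() hunk above keeps a running maximum, so the size recorded for each level is at least as large as the size for every lower level. The snippet restates only that loop. The function name and the level_bounds input are hypothetical; in the patch the per-level bound is the larger of ZSTD_estimateCStreamSize_usingCParams() and ZSTD_estimateDStreamSize().

```c
#include <stddef.h>

/*
 * level_bounds[i] is a hypothetical per-level workspace requirement
 * (in the patch: the larger of the CStream and DStream estimates).
 * ws_sizes[i] receives the running maximum over levels 0..i.
 */
static void sketch_calc_ws_sizes(const size_t *level_bounds, size_t *ws_sizes,
				 size_t nr_levels)
{
	size_t max_size = 0;
	size_t i;

	for (i = 0; i < nr_levels; i++) {
		if (level_bounds[i] > max_size)
			max_size = level_bounds[i];
		ws_sizes[i] = max_size;
	}
}
```

Presumably the point of the monotone table is that a workspace sized for one level can also serve any lower level; the patch itself does not change that behaviour, only the functions used to compute the bounds.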
[PATCH v4 7/9] squashfs: zstd: Switch to the zstd-1.4.6 API
From: Nick Terrell

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell
---
 fs/squashfs/zstd_wrapper.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
index f8c512a6204e..add582409866 100644
--- a/fs/squashfs/zstd_wrapper.c
+++ b/fs/squashfs/zstd_wrapper.c
@@ -11,7 +11,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 #include "squashfs_fs.h"
@@ -34,7 +34,7 @@ static void *zstd_init(struct squashfs_sb_info *msblk, void *buff)
 		goto failed;
 	wksp->window_size = max_t(size_t,
 			msblk->block_size, SQUASHFS_METADATA_SIZE);
-	wksp->mem_size = ZSTD_DStreamWorkspaceBound(wksp->window_size);
+	wksp->mem_size = ZSTD_estimateDStreamSize(wksp->window_size);
 	wksp->mem = vmalloc(wksp->mem_size);
 	if (wksp->mem == NULL)
 		goto failed;
@@ -71,7 +71,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
 	struct bvec_iter_all iter_all = {};
 	struct bio_vec *bvec = bvec_init_iter_all(_all);
-	stream = ZSTD_initDStream(wksp->window_size, wksp->mem, wksp->mem_size);
+	stream = ZSTD_initStaticDStream(wksp->mem, wksp->mem_size);
 	if (!stream) {
 		ERROR("Failed to initialize zstd decompressor\n");
@@ -122,8 +122,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
 			break;
 		if (ZSTD_isError(zstd_err)) {
-			ERROR("zstd decompression error: %d\n",
-					(int)ZSTD_getErrorCode(zstd_err));
+			ERROR("zstd decompression error: %s\n", ZSTD_getErrorName(zstd_err));
 			error = -EIO;
 			break;
 		}
--
2.28.0
[PATCH v4 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is more efficient because it uses the single-pass API instead of the streaming API. The streaming API is not necessary because the whole input and output buffers are available. This saves memory because we don't need to allocate a buffer for the window. It is also more efficient because it saves unnecessary memcpy calls. Compression memory increases from 168 KB to 204 KB because upstream uses slightly more memory. Decompression memory decreases from 1.4 MB to 158 KB. Signed-off-by: Nick Terrell --- fs/f2fs/compress.c | 102 + 1 file changed, 38 insertions(+), 64 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index e056f3a2b404..b79efce81651 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -11,7 +11,8 @@ #include #include #include -#include +#include +#include #include "f2fs.h" #include "node.h" @@ -298,21 +299,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = { static int zstd_init_compress_ctx(struct compress_ctx *cc) { ZSTD_parameters params; - ZSTD_CStream *stream; + ZSTD_CCtx *ctx; void *workspace; unsigned int workspace_size; params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0); - workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams); + workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams); workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode), workspace_size, GFP_NOFS); if (!workspace) return -ENOMEM; - stream = ZSTD_initCStream(params, 0, workspace, workspace_size); - if (!stream) { - printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initCStream failed\n", + ctx = ZSTD_initStaticCCtx(workspace, workspace_size); + if (!ctx) { + printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initStaticCCtx failed\n", KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id, __func__); kvfree(workspace); @@ -320,7 +321,7 @@ static int zstd_init_compress_ctx(struct compress_ctx *cc) } cc->private = workspace; - cc->private2 = stream; + cc->private2 =
ctx; cc->clen = cc->rlen - PAGE_SIZE - COMPRESS_HEADER_SIZE; return 0; @@ -335,65 +336,48 @@ static void zstd_destroy_compress_ctx(struct compress_ctx *cc) static int zstd_compress_pages(struct compress_ctx *cc) { - ZSTD_CStream *stream = cc->private2; - ZSTD_inBuffer inbuf; - ZSTD_outBuffer outbuf; - int src_size = cc->rlen; - int dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE; - int ret; - - inbuf.pos = 0; - inbuf.src = cc->rbuf; - inbuf.size = src_size; - - outbuf.pos = 0; - outbuf.dst = cc->cbuf->cdata; - outbuf.size = dst_size; - - ret = ZSTD_compressStream(stream, &outbuf, &inbuf); - if (ZSTD_isError(ret)) { - printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compressStream failed, ret: %d\n", - KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id, - __func__, ZSTD_getErrorCode(ret)); - return -EIO; - } - - ret = ZSTD_endStream(stream, &outbuf); + ZSTD_CCtx *ctx = cc->private2; + const size_t src_size = cc->rlen; + const size_t dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE; + ZSTD_parameters params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, src_size, 0); + size_t ret; + + ret = ZSTD_compress_advanced( + ctx, cc->cbuf->cdata, dst_size, cc->rbuf, src_size, NULL, 0, params); if (ZSTD_isError(ret)) { - printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_endStream returned %d\n", + /* +* there is compressed data remained in intermediate buffer due to +* no more space in cbuf.cdata +*/ + if (ZSTD_getErrorCode(ret) == ZSTD_error_dstSize_tooSmall) + return -EAGAIN; + /* other compression errors return -EIO */ + printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compress_advanced failed, err: %s\n", KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id, - __func__, ZSTD_getErrorCode(ret)); + __func__, ZSTD_getErrorName(ret)); return -EIO; } - /* -* there is compressed data remained in intermediate buffer due to -* no more space in cbuf.cdata -*/ - if (ret) - return -EAGAIN; - - cc->clen = outbuf.
[PATCH v4 4/9] crypto: zstd: Switch to zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is functionally equivalent. Signed-off-by: Nick Terrell --- crypto/zstd.c | 24 +++- 1 file changed, 11 insertions(+), 13 deletions(-) diff --git a/crypto/zstd.c b/crypto/zstd.c index dcda3cad3b5c..767fe2fbe009 100644 --- a/crypto/zstd.c +++ b/crypto/zstd.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include @@ -24,16 +24,15 @@ struct zstd_ctx { void *dwksp; }; -static ZSTD_parameters zstd_params(void) -{ - return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0); -} - static int zstd_comp_init(struct zstd_ctx *ctx) { int ret = 0; - const ZSTD_parameters params = zstd_params(); - const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams); + const size_t wksp_size = ZSTD_estimateCCtxSize(ZSTD_DEF_LEVEL); + + if (ZSTD_isError(wksp_size)) { + ret = -EINVAL; + goto out_free; + } ctx->cwksp = vzalloc(wksp_size); if (!ctx->cwksp) { @@ -41,7 +40,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx) goto out; } - ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size); + ctx->cctx = ZSTD_initStaticCCtx(ctx->cwksp, wksp_size); if (!ctx->cctx) { ret = -EINVAL; goto out_free; @@ -56,7 +55,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx) static int zstd_decomp_init(struct zstd_ctx *ctx) { int ret = 0; - const size_t wksp_size = ZSTD_DCtxWorkspaceBound(); + const size_t wksp_size = ZSTD_estimateDCtxSize(); ctx->dwksp = vzalloc(wksp_size); if (!ctx->dwksp) { @@ -64,7 +63,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx) goto out; } - ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size); + ctx->dctx = ZSTD_initStaticDCtx(ctx->dwksp, wksp_size); if (!ctx->dctx) { ret = -EINVAL; goto out_free; @@ -152,9 +151,8 @@ static int __zstd_compress(const u8 *src, unsigned int slen, { size_t out_len; struct zstd_ctx *zctx = ctx; - const ZSTD_parameters params = zstd_params(); - out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params); + out_len = ZSTD_compressCCtx(zctx->cctx, 
dst, *dlen, src, slen, ZSTD_DEF_LEVEL); if (ZSTD_isError(out_len)) return -EINVAL; *dlen = out_len; -- 2.28.0
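The crypto patch above follows a three-step pattern: estimate the workspace size, allocate that much memory, then initialize the context in place inside the caller's buffer. The sketch below illustrates only that pattern, without zstd itself. Every name in it (struct sketch_ctx, sketch_estimate_ctx_size, sketch_init_static_ctx) is hypothetical and is not part of the zstd API. The 8-byte alignment check reflects my understanding of the upstream requirement for static workspaces and should be treated as an assumption.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for an opaque context such as ZSTD_CCtx. */
struct sketch_ctx {
	size_t capacity;        /* scratch bytes available after the header */
	unsigned char *scratch; /* points into the caller's workspace */
};

/* Analogue of an "estimate size" call: header plus scratch space. */
static size_t sketch_estimate_ctx_size(size_t scratch_bytes)
{
	return sizeof(struct sketch_ctx) + scratch_bytes;
}

/*
 * Analogue of an "init static" call: place the context at the start of
 * caller-provided memory. Returns NULL when the workspace is missing,
 * misaligned (assumed 8-byte requirement), or too small for the header.
 * The caller keeps ownership of the workspace and frees it later.
 */
static struct sketch_ctx *sketch_init_static_ctx(void *wksp, size_t wksp_size)
{
	struct sketch_ctx *ctx;

	if (wksp == NULL || wksp_size < sizeof(*ctx))
		return NULL;
	if (((uintptr_t)wksp & 7) != 0)
		return NULL;

	ctx = wksp;
	ctx->capacity = wksp_size - sizeof(*ctx);
	ctx->scratch = (unsigned char *)wksp + sizeof(*ctx);
	return ctx;
}
```

As in the patch, a failed init leaves the workspace allocated, so the caller still has to free it on the error path.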
[PATCH v4 1/9] lib: zstd: Add zstd compatibility wrapper
From: Nick Terrell Adds zstd_compat.h which provides the necessary functions from the current zstd.h API. It is only active for zstd versions 1.4.6 and newer. That means it is disabled currently, but will become active when a later patch in this series updates the zstd library in the kernel to 1.4.6. This header allows the zstd upgrade to 1.4.6 without changing any callers, since they all include zstd through the compatibility wrapper. Later patches in this series transition each caller away from the compatibility wrapper. After all the callers have been transitioned away from the compatibility wrapper, the final patch in this series deletes it. Signed-off-by: Nick Terrell --- crypto/zstd.c | 2 +- fs/btrfs/zstd.c | 2 +- fs/f2fs/compress.c | 2 +- fs/squashfs/zstd_wrapper.c | 2 +- include/linux/zstd_compat.h | 116 lib/decompress_unzstd.c | 2 +- 6 files changed, 121 insertions(+), 5 deletions(-) create mode 100644 include/linux/zstd_compat.h diff --git a/crypto/zstd.c b/crypto/zstd.c index 1a3309f066f7..dcda3cad3b5c 100644 --- a/crypto/zstd.c +++ b/crypto/zstd.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c index 9a4871636c6c..a7367ff573d4 100644 --- a/fs/btrfs/zstd.c +++ b/fs/btrfs/zstd.c @@ -16,7 +16,7 @@ #include #include #include -#include +#include #include "misc.h" #include "compression.h" #include "ctree.h" diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 1dfb126a0cb2..e056f3a2b404 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include "f2fs.h" #include "node.h" diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c index b7cb1faa652d..f8c512a6204e 100644 --- a/fs/squashfs/zstd_wrapper.c +++ b/fs/squashfs/zstd_wrapper.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include #include "squashfs_fs.h" diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h 
new file mode 100644 index ..cda9208bf04a --- /dev/null +++ b/include/linux/zstd_compat.h @@ -0,0 +1,116 @@ +/* + * Copyright (c) 2016-present, Facebook, Inc. + * All rights reserved. + * + * This source code is licensed under the BSD-style license found in the + * LICENSE file in the root directory of https://github.com/facebook/zstd. + * An additional grant of patent rights can be found in the PATENTS file in the + * same directory. + * + * This program is free software; you can redistribute it and/or modify it under + * the terms of the GNU General Public License version 2 as published by the + * Free Software Foundation. This program is dual-licensed; you may select + * either version 2 of the GNU General Public License ("GPL") or BSD license + * ("BSD"). + */ + +#ifndef ZSTD_COMPAT_H +#define ZSTD_COMPAT_H + +#include + +#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406) +/* + * This header provides backwards compatibility for the zstd-1.4.6 library + * upgrade. This header allows us to upgrade the zstd library version without + * modifying any callers. Then we will migrate callers from the compatibility + * wrapper one at a time until none remain. At which point we will delete this + * header. + * + * It is temporary and will be deleted once the upgrade is complete. 
+ */ + +#include + +static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params) +{ +return ZSTD_estimateCCtxSize_usingCParams(compression_params); +} + +static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params) +{ +return ZSTD_estimateCStreamSize_usingCParams(compression_params); +} + +static inline size_t ZSTD_DCtxWorkspaceBound(void) +{ +return ZSTD_estimateDCtxSize(); +} + +static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size) +{ +return ZSTD_estimateDStreamSize(window_size); +} + +static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size) +{ +if (wksp == NULL) +return NULL; +return ZSTD_initStaticCCtx(wksp, wksp_size); +} + +static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, uint64_t pledged_src_size, void* wksp, size_t wksp_size) +{ +ZSTD_CStream* cstream; +size_t ret; + +if (wksp == NULL) +return NULL; + +cstream = ZSTD_initStaticCStream(wksp, wksp_size); +if (cstream == NULL) +return NULL; + +/* 0 means unknown in old API but means 0 in new API */ +if (pledged_src_size == 0) +pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN; + +ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size); +if (ZSTD_isError(ret)) +return NULL; + +return cstream; +} +#define ZS
[PATCH v4 0/9] Update to zstd-1.4.6
From: Nick Terrell

This patchset upgrades the zstd library to the latest upstream release. The current zstd version in the kernel is a modified version of upstream zstd-1.3.1. At the time it was integrated, zstd wasn't ready to be used in the kernel as-is. But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.6 upstream. I want the zstd version in the kernel to match up with a known upstream release, so we know exactly what code is running. Whenever this patchset is ready for merge, I will cut a release at the upstream commit that gets merged. This should not be necessary for future releases.

The kernel zstd library is automatically generated from upstream zstd. A script makes the necessary changes and imports it into the kernel. The changes are:
1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested every commit by upstream's continuous integration. When we cut a new zstd release, we will submit a patch to the kernel to update the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must build, so that precludes partial updates. Since the commit is 100% generated, I hope the review burden is lightened. I considered replaying upstream commits, but that is not possible because there have been ~3500 upstream commits since the last zstd import, and the commits don't all build individually. The bulk update preserves bisectability because bugs can be bisected to the zstd version update. At that point the update can be reverted, and we can work with upstream to find and fix the bug. After this big switch in how the kernel consumes zstd, future patches will be smaller, because they will only have one upstream release worth of changes each.
This patchset changes the zstd API from a custom kernel API to the upstream API. I considered wrapping the upstream API with a wrapper that is closer to the kernel style guide. Following advice from https://lkml.org/lkml/2020/9/17/814 I've chosen to use the upstream API directly, to minimize opportunities to introduce bugs, and because using the upstream API directly makes debugging and communication with upstream easier.

This patchset comes in 3 parts:
1. The first 2 patches prepare for the zstd upgrade. The first patch adds a compatibility wrapper so zstd can be upgraded without modifying any callers. The second patch adds an indirection for how lib/decompress_unzstd.c includes all decompression source files.
2. Import zstd-1.4.6. This patch is completely generated from upstream using automated tooling.
3. Update all callers to the zstd-1.4.6 API then delete the compatibility wrapper.

I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade using the compatibility wrapper, and after the final patch in this series. I tested kernel and initramfs decompression in i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6. I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k. I found:
* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during compression. For example the recent problem with large kernel decompression has been fixed upstream for over 2 years https://lkml.org/lkml/2020/9/29/27.

Please let me know if there is anything that I can do to ease the way for these patches.
I think it is important because it gets large performance improvements, contains bug fixes, and is switching to a more maintainable model of consuming upstream zstd directly, making it easy to keep up to date. Best, Nick Terrell v1 -> v2: * Successfully tested F2FS with help from Chao Yu to fix my test. * (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 means unknown. This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test. v2 -> v3: * (3/9) Silence warnings by Kernel Test Robot: https://github.com/facebook/zstd/pull/2324 Stack size warnings remain, but these aren't new, and the functions it warns on are either unused or not in the maximum stack path. This patchset reduces zstd compression stack usage by 1 KB overall. I've gotten the low hanging fruit, and more stack reduction would require significant changes that have the potential to introduce new bugs. However, I do hope to continue to reduce zstd stack usage in future versions. v3 -> v4: * (3/9) Fix errors and
[PATCH v4 2/9] lib: zstd: Add decompress_sources.h for decompress_unzstd
From: Nick Terrell Adds decompress_sources.h which includes every .c file necessary for zstd decompression. This is used in decompress_unzstd.c so the internal structure of the library isn't exposed. This allows us to upgrade the zstd library version without modifying any callers. Instead we just need to update decompress_sources.h. Signed-off-by: Nick Terrell --- lib/decompress_unzstd.c | 6 +- lib/zstd/decompress_sources.h | 14 ++ 2 files changed, 15 insertions(+), 5 deletions(-) create mode 100644 lib/zstd/decompress_sources.h diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c index dbc290af26b4..a79f705f236d 100644 --- a/lib/decompress_unzstd.c +++ b/lib/decompress_unzstd.c @@ -68,11 +68,7 @@ #ifdef STATIC # define UNZSTD_PREBOOT # include "xxhash.c" -# include "zstd/entropy_common.c" -# include "zstd/fse_decompress.c" -# include "zstd/huf_decompress.c" -# include "zstd/zstd_common.c" -# include "zstd/decompress.c" +# include "zstd/decompress_sources.h" #endif #include diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h new file mode 100644 index ..ccb4960ea0cd --- /dev/null +++ b/lib/zstd/decompress_sources.h @@ -0,0 +1,14 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* + * This file includes every .c file needed for decompression. + * It is used by lib/decompress_unzstd.c to include the decompression + * source into the translation-unit, so it can be used for kernel + * decompression. + */ + +#include "entropy_common.c" +#include "fse_decompress.c" +#include "huf_decompress.c" +#include "zstd_common.c" +#include "decompress.c" -- 2.28.0
Re: PROBLEM: zstd bzImage decompression fails for some x86_32 config on 5.9-rc1
> On Sep 28, 2020, at 11:02 AM, Nick Terrell wrote: > > > >> On Sep 28, 2020, at 1:55 AM, Feng Tang wrote: >> >> Hi Nick, >> >> 0day has found some kernel decompression failure case since 5.9-rc1 (X86_32 >> build), and it could be related with ZSTD code, though initially we bisected >> to some other commits. >> >> The error messages are: >> >> early console in setup code >> Wrong EFI loader signature. >> early console in extract_kernel >> input_data: 0x046f50b4 >> input_len: 0x01ebbeb6 >> output: 0x0100 >> output_len: 0x04fc535c >> kernel_total_size: 0x055f5000 >> needed_size: 0x055f5000 >> >> Decompressing Linux... >> >> ZSTD-compressed data is corrupt >> >> This could be reproduced by compiling the kernel with attached config, >> and use QEMU to boot it. >> >> We suspect it could be related with the kernel size, as we only see >> it on big kernel, and some more info are: >> >> * If we remove a lot of kernel config to build a much smaller kernel, >> it will boot fine >> >> * If we change the zstd algorithm from zstd22 to zstd19, the kernel will >> boot fine with below patch >> >> diff --git a/arch/x86/boot/compressed/Makefile >> b/arch/x86/boot/compressed/Makefile >> index 3962f59..8fe71ba 100644 >> --- a/arch/x86/boot/compressed/Makefile >> +++ b/arch/x86/boot/compressed/Makefile >> @@ -147,7 +147,7 @@ $(obj)/vmlinux.bin.lzo: $(vmlinux.bin.all-y) FORCE >> $(obj)/vmlinux.bin.zst: $(vmlinux.bin.all-y) FORCE >> - $(call if_changed,zstd22) >> + $(call if_changed,zstd) >> >> >> Please let me know if you need more info, and sorry for the late report >> as we just tracked down to this point. > > Thanks for the report, I will look into it today. CC: Petr Malat I’ve successfully reproduced, and found the issue. It turns out that this patch [0] from Petr Malat fixes the issue. As I mentioned in that thread, his fix corresponds to this upstream commit [1]. Can we get Petr's patch merged into v5.9? This bug only happens when the window size is > 8 MB.
A non-kernel workaround would be to compress the kernel at level 19 instead of level 22, which uses an 8 MB window size instead of a 128 MB window size. The reason it only shows up for large kernels is that the code is only buggy when an offset > 8 MB is used, so a kernel <= 8 MB can't trigger the bug. Best, Nick [0] https://lkml.org/lkml/2020/9/14/94 [1] https://github.com/facebook/zstd/commit/8a5c0c98ae5a7884694589d7a69bc99011add94d > Best, > Nick > >> Thanks, >> Feng >> >> >> >>
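The reasoning in the reply above can be sketched as a small check. This is only an illustration under stated assumptions: the function and macro names are hypothetical, the window logs 23 and 27 are simply the base-2 logs of the 8 MB and 128 MB sizes quoted in the message, and the 8 MB threshold is taken from the message rather than from the zstd source.

```c
#include <stdbool.h>
#include <stdint.h>

/* Threshold quoted in the message: only offsets beyond 8 MB hit the bug. */
#define SKETCH_BUGGY_OFFSET_LIMIT ((uint64_t)8 << 20)

/* Window size in bytes for a given window log (size = 2^log). */
static uint64_t sketch_window_size(unsigned int window_log)
{
	return (uint64_t)1 << window_log;
}

/*
 * A match offset can exceed neither the window size nor the amount of
 * data decompressed so far, so the largest possible offset is bounded
 * by the smaller of the two. The bug is reachable only when that bound
 * is above the 8 MB threshold.
 */
static bool sketch_can_trigger_bug(uint64_t data_size, unsigned int window_log)
{
	uint64_t window = sketch_window_size(window_log);
	uint64_t max_offset = data_size < window ? data_size : window;

	return max_offset > SKETCH_BUGGY_OFFSET_LIMIT;
}
```

With the output_len from the report (0x04fc535c, roughly 80 MB), the check is true for a 128 MB window (log 27) and false for an 8 MB window (log 23), which matches the observation that switching from level 22 to level 19 avoids the failure.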
Re: PROBLEM: zstd bzImage decompression fails for some x86_32 config on 5.9-rc1
> On Sep 28, 2020, at 1:55 AM, Feng Tang wrote: > > Hi Nick, > > 0day has found some kernel decompression failure case since 5.9-rc1 (X86_32 > build), and it could be related with ZSTD code, though initially we bisected > to some other commits. > > The error messages are: > > early console in setup code > Wrong EFI loader signature. > early console in extract_kernel > input_data: 0x046f50b4 > input_len: 0x01ebbeb6 > output: 0x0100 > output_len: 0x04fc535c > kernel_total_size: 0x055f5000 > needed_size: 0x055f5000 > > Decompressing Linux... > > ZSTD-compressed data is corrupt > > This could be reproduced by compiling the kernel with attached config, > and use QEMU to boot it. > > We suspect it could be related with the kernel size, as we only see > it on big kernel, and some more info are: > > * If we remove a lot of kernel config to build a much smaller kernel, > it will boot fine > > * If we change the zstd algorithm from zstd22 to zstd19, the kernel will > boot fine with below patch > > diff --git a/arch/x86/boot/compressed/Makefile > b/arch/x86/boot/compressed/Makefile > index 3962f59..8fe71ba 100644 > --- a/arch/x86/boot/compressed/Makefile > +++ b/arch/x86/boot/compressed/Makefile > @@ -147,7 +147,7 @@ $(obj)/vmlinux.bin.lzo: $(vmlinux.bin.all-y) FORCE >$(obj)/vmlinux.bin.zst: $(vmlinux.bin.all-y) FORCE > - $(call if_changed,zstd22) > + $(call if_changed,zstd) > > > Please let me know if you need more info, and sorry for the late report > as we just tracked down to this point. Thanks for the report, I will look into it today. Best, Nick > Thanks, > Feng > > > >
Re: [PATCH v3 3/9] lib: zstd: Upgrade to latest upstream zstd version 1.4.6
On Wed, Sep 23, 2020 at 7:28 PM kernel test robot wrote: > > Hi Nick, > > Thank you for the patch! Yet something to improve: > > [auto build test ERROR on kdave/for-next] > [also build test ERROR on f2fs/dev-test linus/master v5.9-rc6 next-20200923] > [cannot apply to cryptodev/master crypto/master] > [If your patch is applied to the wrong git tree, kindly drop us a note. > And when submitting patch, we suggest to use '--base' as documented in > https://git-scm.com/docs/git-format-patch] > > url: > https://github.com/0day-ci/linux/commits/Nick-Terrell/Update-to-zstd-1-4-6/20200924-064102 > base: https://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux.git > for-next > config: h8300-randconfig-p002-20200923 (attached as .config) > compiler: h8300-linux-gcc (GCC) 9.3.0 > reproduce (this is a W=1 build): > wget > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O > ~/bin/make.cross > chmod +x ~/bin/make.cross > # save the attached .config to linux build tree > COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross > ARCH=h8300 > > If you fix the issue, kindly add following tag as appropriate > Reported-by: kernel test robot > > All errors (new ones prefixed by >>): > >h8300-linux-ld: lib/zstd/common/entropy_common.o: in function `MEM_swap32': > >> lib/zstd/common/mem.h:179: undefined reference to `__bswapsi2' > >> h8300-linux-ld: lib/zstd/common/mem.h:179: undefined reference to > >> `__bswapsi2' > >> h8300-linux-ld: lib/zstd/common/mem.h:179: undefined reference to > >> `__bswapsi2' > >> h8300-linux-ld: lib/zstd/common/mem.h:179: undefined reference to > >> `__bswapsi2' >h8300-linux-ld: lib/zstd/common/fse_decompress.o: in function `MEM_swap32': > >> lib/zstd/common/mem.h:179: undefined reference to `__bswapsi2' >h8300-linux-ld: > lib/zstd/common/fse_decompress.o:lib/zstd/common/mem.h:179: more undefined > references to `__bswapsi2' follow >h8300-linux-ld: lib/zstd/compress/zstd_compress.o: in function > `MEM_swap64': > >> 
lib/zstd/compress/../common/mem.h:192: undefined reference to `__bswapdi2' >h8300-linux-ld: lib/zstd/compress/zstd_compress.o: in function > `MEM_swap32': > >> lib/zstd/compress/../common/mem.h:179: undefined reference to `__bswapsi2' > >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:179: undefined reference > >> to `__bswapsi2' > >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:179: undefined reference > >> to `__bswapsi2' > >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:179: undefined reference > >> to `__bswapsi2' > >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:179: undefined reference > >> to `__bswapsi2' >h8300-linux-ld: > lib/zstd/compress/zstd_compress.o:lib/zstd/compress/../common/mem.h:179: more > undefined references to `__bswapsi2' follow >h8300-linux-ld: lib/zstd/compress/zstd_double_fast.o: in function > `MEM_swap64': > >> lib/zstd/compress/../common/mem.h:192: undefined reference to `__bswapdi2' > >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference > >> to `__bswapdi2' > >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference > >> to `__bswapdi2' > >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference > >> to `__bswapdi2' > >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference > >> to `__bswapdi2' >h8300-linux-ld: > lib/zstd/compress/zstd_double_fast.o:lib/zstd/compress/../common/mem.h:192: > more undefined references to `__bswapdi2' follow >h8300-linux-ld: lib/zstd/compress/zstd_opt.o: in function `MEM_swap32': > >> lib/zstd/compress/../common/mem.h:179: undefined reference to `__bswapsi2' > >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:179: undefined reference > >> to `__bswapsi2' >h8300-linux-ld: lib/zstd/compress/zstd_opt.o: in function `MEM_swap64': > >> lib/zstd/compress/../common/mem.h:192: undefined reference to `__bswapdi2' > >> h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference > >> to 
`__bswapdi2' >h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference > to `__bswapdi2' >h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference > to `__bswapdi2' >h8300-linux-ld: lib/zstd/compress/../common/mem.h:192: undefined reference > to `__bswapdi2' >h8300-linux-ld: > lib/zstd/compress/zstd_opt.o:lib/zstd/compress/../common/mem.h:192: more > undefin
[PATCH v3 7/9] squashfs: zstd: Switch to the zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is functionally equivalent. Signed-off-by: Nick Terrell --- fs/squashfs/zstd_wrapper.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c index f8c512a6204e..add582409866 100644 --- a/fs/squashfs/zstd_wrapper.c +++ b/fs/squashfs/zstd_wrapper.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include #include "squashfs_fs.h" @@ -34,7 +34,7 @@ static void *zstd_init(struct squashfs_sb_info *msblk, void *buff) goto failed; wksp->window_size = max_t(size_t, msblk->block_size, SQUASHFS_METADATA_SIZE); - wksp->mem_size = ZSTD_DStreamWorkspaceBound(wksp->window_size); + wksp->mem_size = ZSTD_estimateDStreamSize(wksp->window_size); wksp->mem = vmalloc(wksp->mem_size); if (wksp->mem == NULL) goto failed; @@ -71,7 +71,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm, struct bvec_iter_all iter_all = {}; struct bio_vec *bvec = bvec_init_iter_all(&iter_all); - stream = ZSTD_initDStream(wksp->window_size, wksp->mem, wksp->mem_size); + stream = ZSTD_initStaticDStream(wksp->mem, wksp->mem_size); if (!stream) { ERROR("Failed to initialize zstd decompressor\n"); @@ -122,8 +122,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm, break; if (ZSTD_isError(zstd_err)) { - ERROR("zstd decompression error: %d\n", - (int)ZSTD_getErrorCode(zstd_err)); + ERROR("zstd decompression error: %s\n", ZSTD_getErrorName(zstd_err)); error = -EIO; break; } -- 2.28.0
[PATCH v3 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is more efficient because it uses the single-pass API instead of the streaming API. The streaming API is not necessary because the whole input and output buffers are available. This saves memory because we don't need to allocate a buffer for the window. It is also more efficient because it saves unnecessary memcpy calls. Compression memory increases from 168 KB to 204 KB because upstream uses slightly more memory. Decompression memory decreases from 1.4 MB to 158 KB. Signed-off-by: Nick Terrell --- fs/f2fs/compress.c | 102 + 1 file changed, 38 insertions(+), 64 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index e056f3a2b404..b79efce81651 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -11,7 +11,8 @@ #include #include #include -#include +#include +#include #include "f2fs.h" #include "node.h" @@ -298,21 +299,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = { static int zstd_init_compress_ctx(struct compress_ctx *cc) { ZSTD_parameters params; - ZSTD_CStream *stream; + ZSTD_CCtx *ctx; void *workspace; unsigned int workspace_size; params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0); - workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams); + workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams); workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode), workspace_size, GFP_NOFS); if (!workspace) return -ENOMEM; - stream = ZSTD_initCStream(params, 0, workspace, workspace_size); - if (!stream) { - printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initCStream failed\n", + ctx = ZSTD_initStaticCCtx(workspace, workspace_size); + if (!ctx) { + printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initStaticCCtx failed\n", KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id, __func__); kvfree(workspace); @@ -320,7 +321,7 @@ static int zstd_init_compress_ctx(struct compress_ctx *cc) } cc->private = workspace; - cc->private2 = stream; + cc->private2 =
ctx; cc->clen = cc->rlen - PAGE_SIZE - COMPRESS_HEADER_SIZE; return 0; @@ -335,65 +336,48 @@ static void zstd_destroy_compress_ctx(struct compress_ctx *cc) static int zstd_compress_pages(struct compress_ctx *cc) { - ZSTD_CStream *stream = cc->private2; - ZSTD_inBuffer inbuf; - ZSTD_outBuffer outbuf; - int src_size = cc->rlen; - int dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE; - int ret; - - inbuf.pos = 0; - inbuf.src = cc->rbuf; - inbuf.size = src_size; - - outbuf.pos = 0; - outbuf.dst = cc->cbuf->cdata; - outbuf.size = dst_size; - - ret = ZSTD_compressStream(stream, &outbuf, &inbuf); - if (ZSTD_isError(ret)) { - printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compressStream failed, ret: %d\n", - KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id, - __func__, ZSTD_getErrorCode(ret)); - return -EIO; - } - - ret = ZSTD_endStream(stream, &outbuf); + ZSTD_CCtx *ctx = cc->private2; + const size_t src_size = cc->rlen; + const size_t dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE; + ZSTD_parameters params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, src_size, 0); + size_t ret; + + ret = ZSTD_compress_advanced( + ctx, cc->cbuf->cdata, dst_size, cc->rbuf, src_size, NULL, 0, params); if (ZSTD_isError(ret)) { - printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_endStream returned %d\n", + /* +* there is compressed data remained in intermediate buffer due to +* no more space in cbuf.cdata +*/ + if (ZSTD_getErrorCode(ret) == ZSTD_error_dstSize_tooSmall) + return -EAGAIN; + /* other compression errors return -EIO */ + printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compress_advanced failed, err: %s\n", KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id, - __func__, ZSTD_getErrorCode(ret)); + __func__, ZSTD_getErrorName(ret)); return -EIO; } - /* -* there is compressed data remained in intermediate buffer due to -* no more space in cbuf.cdata -*/ - if (ret) - return -EAGAIN; - - cc->clen = outbuf.
[PATCH v3 9/9] lib: zstd: Remove zstd compatibility wrapper
From: Nick Terrell

All callers have been transitioned to the new zstd-1.4.6 API. There are
no more callers of the zstd compatibility wrapper, so delete it.

Signed-off-by: Nick Terrell
---
 include/linux/zstd_compat.h | 116
 1 file changed, 116 deletions(-)
 delete mode 100644 include/linux/zstd_compat.h

diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h
deleted file mode 100644
index cda9208bf04a..
--- a/include/linux/zstd_compat.h
+++ /dev/null
@@ -1,116 +0,0 @@
-/*
- * Copyright (c) 2016-present, Facebook, Inc.
- * All rights reserved.
- *
- * This source code is licensed under the BSD-style license found in the
- * LICENSE file in the root directory of https://github.com/facebook/zstd.
- * An additional grant of patent rights can be found in the PATENTS file in the
- * same directory.
- *
- * This program is free software; you can redistribute it and/or modify it under
- * the terms of the GNU General Public License version 2 as published by the
- * Free Software Foundation. This program is dual-licensed; you may select
- * either version 2 of the GNU General Public License ("GPL") or BSD license
- * ("BSD").
- */
-
-#ifndef ZSTD_COMPAT_H
-#define ZSTD_COMPAT_H
-
-#include
-
-#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406)
-/*
- * This header provides backwards compatibility for the zstd-1.4.6 library
- * upgrade. This header allows us to upgrade the zstd library version without
- * modifying any callers. Then we will migrate callers from the compatibility
- * wrapper one at a time until none remain. At which point we will delete this
- * header.
- *
- * It is temporary and will be deleted once the upgrade is complete.
- */
-
-#include
-
-static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params)
-{
-	return ZSTD_estimateCCtxSize_usingCParams(compression_params);
-}
-
-static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params)
-{
-	return ZSTD_estimateCStreamSize_usingCParams(compression_params);
-}
-
-static inline size_t ZSTD_DCtxWorkspaceBound(void)
-{
-	return ZSTD_estimateDCtxSize();
-}
-
-static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size)
-{
-	return ZSTD_estimateDStreamSize(window_size);
-}
-
-static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size)
-{
-	if (wksp == NULL)
-		return NULL;
-	return ZSTD_initStaticCCtx(wksp, wksp_size);
-}
-
-static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, uint64_t pledged_src_size, void* wksp, size_t wksp_size)
-{
-	ZSTD_CStream* cstream;
-	size_t ret;
-
-	if (wksp == NULL)
-		return NULL;
-
-	cstream = ZSTD_initStaticCStream(wksp, wksp_size);
-	if (cstream == NULL)
-		return NULL;
-
-	/* 0 means unknown in old API but means 0 in new API */
-	if (pledged_src_size == 0)
-		pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN;
-
-	ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size);
-	if (ZSTD_isError(ret))
-		return NULL;
-
-	return cstream;
-}
-#define ZSTD_initCStream ZSTD_initCStream_compat
-
-static inline ZSTD_DCtx* ZSTD_initDCtx(void* wksp, size_t wksp_size)
-{
-	if (wksp == NULL)
-		return NULL;
-	return ZSTD_initStaticDCtx(wksp, wksp_size);
-}
-
-static inline ZSTD_DStream* ZSTD_initDStream_compat(unsigned long long window_size, void* wksp, size_t wksp_size)
-{
-	if (wksp == NULL)
-		return NULL;
-	(void)window_size;
-	return ZSTD_initStaticDStream(wksp, wksp_size);
-}
-#define ZSTD_initDStream ZSTD_initDStream_compat
-
-typedef ZSTD_frameHeader ZSTD_frameParams;
-
-static inline size_t ZSTD_getFrameParams(ZSTD_frameParams* frame_params, const void* src, size_t src_size)
-{
-	return ZSTD_getFrameHeader(frame_params, src, src_size);
-}
-
-static inline size_t ZSTD_compressCCtx_compat(ZSTD_CCtx* cctx, void* dst, size_t dst_capacity, const void* src, size_t src_size, ZSTD_parameters params)
-{
-	return ZSTD_compress_advanced(cctx, dst, dst_capacity, src, src_size, NULL, 0, params);
-}
-#define ZSTD_compressCCtx ZSTD_compressCCtx_compat
-
-#endif /* ZSTD_VERSION_NUMBER >= 10406 */
-#endif /* ZSTD_COMPAT_H */
-- 
2.28.0
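The `ZSTD_VERSION_NUMBER >= 10406` gate in the header above depends on how zstd packs its version into one integer. The following is a minimal sketch of that packing, assuming the upstream scheme of major*100*100 + minor*100 + release; the helper names `zstd_pack_version` and `compat_wrapper_active` are hypothetical and exist only for illustration.

```c
#include <assert.h>

/*
 * Hypothetical helper: packs a version triple the way upstream zstd's
 * ZSTD_VERSION_NUMBER macro does (major*100*100 + minor*100 + release).
 */
static unsigned zstd_pack_version(unsigned major, unsigned minor,
				  unsigned release)
{
	return major * 100 * 100 + minor * 100 + release;
}

/*
 * Hypothetical helper mirroring the header's preprocessor gate:
 * non-zero when the compatibility wrapper would be compiled in.
 */
static int compat_wrapper_active(unsigned version_number)
{
	return version_number >= 10406;
}
```

Under this scheme zstd-1.4.6 packs to 10406, so the wrapper stays inactive for the 1.3.1-based library and switches on once the library is upgraded.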
[PATCH v3 8/9] lib: unzstd: Switch to the zstd-1.4.6 API
From: Nick Terrell

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell
---
 lib/decompress_unzstd.c | 40 ++--
 1 file changed, 14 insertions(+), 26 deletions(-)

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index a79f705f236d..d4685df0e120 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -73,7 +73,8 @@
 
 #include
 #include
-#include
+#include
+#include
 
 /* 128MB is the maximum window size supported by zstd. */
 #define ZSTD_WINDOWSIZE_MAX	(1 << ZSTD_WINDOWLOG_MAX)
@@ -120,9 +121,9 @@ static int INIT decompress_single(const u8 *in_buf, long in_len, u8 *out_buf,
 				  long out_len, long *in_pos,
 				  void (*error)(char *x))
 {
-	const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+	const size_t wksp_size = ZSTD_estimateDCtxSize();
 	void *wksp = large_malloc(wksp_size);
-	ZSTD_DCtx *dctx = ZSTD_initDCtx(wksp, wksp_size);
+	ZSTD_DCtx *dctx = ZSTD_initStaticDCtx(wksp, wksp_size);
 	int err;
 	size_t ret;
 
@@ -165,7 +166,6 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
 {
 	ZSTD_inBuffer in;
 	ZSTD_outBuffer out;
-	ZSTD_frameParams params;
 	void *in_allocated = NULL;
 	void *out_allocated = NULL;
 	void *wksp = NULL;
@@ -229,36 +229,24 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len,
 	out.size = out_len;
 
 	/*
-	 * We need to know the window size to allocate the ZSTD_DStream.
-	 * Since we are streaming, we need to allocate a buffer for the sliding
-	 * window. The window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
-	 * (8 MB), so it is important to use the actual value so as not to
-	 * waste memory when it is smaller.
+	 * Zstd determines the workspace size from the window size written
+	 * into the frame header. This ensures that we use the minimum value
+	 * possible, since the window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX
+	 * (1 GB), so it is very important to use the actual value.
 	 */
-	ret = ZSTD_getFrameParams(&params, in.src, in.size);
+	wksp_size = ZSTD_estimateDStreamSize_fromFrame(in.src, in.size);
 	err = handle_zstd_error(ret, error);
 	if (err)
 		goto out;
-	if (ret != 0) {
-		error("ZSTD-compressed data has an incomplete frame header");
-		err = -1;
-		goto out;
-	}
-	if (params.windowSize > ZSTD_WINDOWSIZE_MAX) {
-		error("ZSTD-compressed data has too large a window size");
+	wksp = large_malloc(wksp_size);
+	if (wksp == NULL) {
+		error("Out of memory while allocating ZSTD_DStream");
 		err = -1;
 		goto out;
 	}
-
-	/*
-	 * Allocate the ZSTD_DStream now that we know how much memory is
-	 * required.
-	 */
-	wksp_size = ZSTD_DStreamWorkspaceBound(params.windowSize);
-	wksp = large_malloc(wksp_size);
-	dstream = ZSTD_initDStream(params.windowSize, wksp, wksp_size);
+	dstream = ZSTD_initStaticDStream(wksp, wksp_size);
 	if (dstream == NULL) {
-		error("Out of memory while allocating ZSTD_DStream");
+		error("ZSTD_initStaticDStream failed");
 		err = -1;
 		goto out;
 	}
-- 
2.28.0
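The comments in this patch quote window sizes (1 KB, 128 MB, 1 GB) that all derive from a window log, as in `1 << ZSTD_WINDOWLOG_MAX`. The arithmetic can be sketched as follows; `window_size_from_log` is a hypothetical helper, and the patch itself does not state which value `ZSTD_WINDOWLOG_MAX` takes in a given build.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: window size in bytes for a given window log.
 * The shift is done in size_t rather than int so that larger logs do
 * not overflow a 32-bit int; window_log must be smaller than the bit
 * width of size_t.
 */
static size_t window_size_from_log(unsigned window_log)
{
	return (size_t)1 << window_log;
}
```

With this helper a log of 10 yields 1 KB, 27 yields the 128 MB named in the unchanged comment, and 30 yields the 1 GB named in the new comment.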
[PATCH v3 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API
From: Nick Terrell

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell
---
 fs/btrfs/zstd.c | 48
 1 file changed, 28 insertions(+), 20 deletions(-)

diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index a7367ff573d4..6b466e090cd7 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@
 #include
 #include
 #include
-#include
+#include
 #include "misc.h"
 #include "compression.h"
 #include "ctree.h"
@@ -159,8 +159,8 @@ static void zstd_calc_ws_mem_sizes(void)
 			zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT);
 		size_t level_size =
 			max_t(size_t,
-			      ZSTD_CStreamWorkspaceBound(params.cParams),
-			      ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT));
+			      ZSTD_estimateCStreamSize_usingCParams(params.cParams),
+			      ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT));
 
 		max_size = max_t(size_t, max_size, level_size);
 		zstd_ws_mem_sizes[level - 1] = max_size;
@@ -389,13 +389,23 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
 	*total_in = 0;
 
 	/* Initialize the stream */
-	stream = ZSTD_initCStream(params, len, workspace->mem,
-			workspace->size);
+	stream = ZSTD_initStaticCStream(workspace->mem, workspace->size);
 	if (!stream) {
-		pr_warn("BTRFS: ZSTD_initCStream failed\n");
+		pr_warn("BTRFS: ZSTD_initStaticCStream failed\n");
 		ret = -EIO;
 		goto out;
 	}
+	{
+		size_t ret2;
+
+		ret2 = ZSTD_initCStream_advanced(stream, NULL, 0, params, len);
+		if (ZSTD_isError(ret2)) {
+			pr_warn("BTRFS: ZSTD_initCStream_advanced returned %s\n",
+					ZSTD_getErrorName(ret2));
+			ret = -EIO;
+			goto out;
+		}
+	}
 
 	/* map in the first page of input data */
 	in_page = find_get_page(mapping, start >> PAGE_SHIFT);
@@ -421,8 +431,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
 		ret2 = ZSTD_compressStream(stream, &workspace->out_buf,
 				&workspace->in_buf);
 		if (ZSTD_isError(ret2)) {
-			pr_debug("BTRFS: ZSTD_compressStream returned %d\n",
-					ZSTD_getErrorCode(ret2));
+			pr_debug("BTRFS: ZSTD_compressStream returned %s\n",
+					ZSTD_getErrorName(ret2));
 			ret = -EIO;
 			goto out;
 		}
@@ -489,8 +499,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping,
 
 		ret2 = ZSTD_endStream(stream, &workspace->out_buf);
 		if (ZSTD_isError(ret2)) {
-			pr_debug("BTRFS: ZSTD_endStream returned %d\n",
-					ZSTD_getErrorCode(ret2));
+			pr_debug("BTRFS: ZSTD_endStream returned %s\n",
+					ZSTD_getErrorName(ret2));
 			ret = -EIO;
 			goto out;
 		}
@@ -557,10 +567,9 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 	unsigned long buf_start;
 	unsigned long total_out = 0;
 
-	stream = ZSTD_initDStream(
-			ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
+	stream = ZSTD_initStaticDStream(workspace->mem, workspace->size);
 	if (!stream) {
-		pr_debug("BTRFS: ZSTD_initDStream failed\n");
+		pr_debug("BTRFS: ZSTD_initStaticDStream failed\n");
 		ret = -EIO;
 		goto done;
 	}
@@ -579,8 +588,8 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 		ret2 = ZSTD_decompressStream(stream, &workspace->out_buf,
 				&workspace->in_buf);
 		if (ZSTD_isError(ret2)) {
-			pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
-					ZSTD_getErrorCode(ret2));
+			pr_debug("BTRFS: ZSTD_decompressStream returned %s\n",
+					ZSTD_getErrorName(ret2));
 			ret = -EIO;
 			goto done;
 		}
@@ -633,10 +642,9 @@ int zstd_decompress(struct list_head *ws, unsigned char *data_in,
 	unsigned long pg_offset = 0;
 	char *kaddr;
 
-
[PATCH v3 4/9] crypto: zstd: Switch to zstd-1.4.6 API
From: Nick Terrell

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell
---
 crypto/zstd.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/crypto/zstd.c b/crypto/zstd.c
index dcda3cad3b5c..767fe2fbe009 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 
 
@@ -24,16 +24,15 @@ struct zstd_ctx {
 	void *dwksp;
 };
 
-static ZSTD_parameters zstd_params(void)
-{
-	return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
-}
-
 static int zstd_comp_init(struct zstd_ctx *ctx)
 {
 	int ret = 0;
-	const ZSTD_parameters params = zstd_params();
-	const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+	const size_t wksp_size = ZSTD_estimateCCtxSize(ZSTD_DEF_LEVEL);
+
+	if (ZSTD_isError(wksp_size)) {
+		ret = -EINVAL;
+		goto out_free;
+	}
 
 	ctx->cwksp = vzalloc(wksp_size);
 	if (!ctx->cwksp) {
@@ -41,7 +40,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 		goto out;
 	}
 
-	ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+	ctx->cctx = ZSTD_initStaticCCtx(ctx->cwksp, wksp_size);
 	if (!ctx->cctx) {
 		ret = -EINVAL;
 		goto out_free;
@@ -56,7 +55,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 static int zstd_decomp_init(struct zstd_ctx *ctx)
 {
 	int ret = 0;
-	const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+	const size_t wksp_size = ZSTD_estimateDCtxSize();
 
 	ctx->dwksp = vzalloc(wksp_size);
 	if (!ctx->dwksp) {
@@ -64,7 +63,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx)
 		goto out;
 	}
 
-	ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+	ctx->dctx = ZSTD_initStaticDCtx(ctx->dwksp, wksp_size);
 	if (!ctx->dctx) {
 		ret = -EINVAL;
 		goto out_free;
@@ -152,9 +151,8 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
 {
 	size_t out_len;
 	struct zstd_ctx *zctx = ctx;
-	const ZSTD_parameters params = zstd_params();
 
-	out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
+	out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, ZSTD_DEF_LEVEL);
 	if (ZSTD_isError(out_len))
 		return -EINVAL;
 	*dlen = out_len;
-- 
2.28.0
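The patch above checks `size_t` return values with `ZSTD_isError()`, which works because zstd reports errors in-band as values at the very top of the `size_t` range. A minimal stand-alone sketch of that convention follows, assuming errors are encoded as the negated error code cast to `size_t`. Every `demo_` name and both numeric codes are illustrative stand-ins, not the library's API.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative stand-ins for two values of an error-code enum. */
enum {
	DEMO_ERROR_DSTSIZE_TOOSMALL = 70,
	DEMO_ERROR_MAXCODE = 120,
};

/* Encode an error code as a size_t near SIZE_MAX (0 - code wraps around). */
static size_t demo_make_error(int code)
{
	return (size_t)0 - (size_t)code;
}

/* A result is an error when it lies above the encoded maximum code. */
static int demo_is_error(size_t ret)
{
	return ret > demo_make_error(DEMO_ERROR_MAXCODE);
}

/* Recover the error code, or 0 when the result is an ordinary size. */
static int demo_get_error_code(size_t ret)
{
	return demo_is_error(ret) ? (int)((size_t)0 - ret) : 0;
}
```

This is why a single return value can carry either a byte count or an error, and why callers must test it before using it as a size, as `zstd_comp_init()` does with `wksp_size`.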
[PATCH v3 2/9] lib: zstd: Add decompress_sources.h for decompress_unzstd
From: Nick Terrell

Adds decompress_sources.h which includes every .c file necessary for
zstd decompression. This is used in decompress_unzstd.c so the internal
structure of the library isn't exposed.

This allows us to upgrade the zstd library version without modifying any
callers. Instead we just need to update decompress_sources.h.

Signed-off-by: Nick Terrell
---
 lib/decompress_unzstd.c | 6 +-
 lib/zstd/decompress_sources.h | 14 ++
 2 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 lib/zstd/decompress_sources.h

diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c
index dbc290af26b4..a79f705f236d 100644
--- a/lib/decompress_unzstd.c
+++ b/lib/decompress_unzstd.c
@@ -68,11 +68,7 @@
 #ifdef STATIC
 # define UNZSTD_PREBOOT
 # include "xxhash.c"
-# include "zstd/entropy_common.c"
-# include "zstd/fse_decompress.c"
-# include "zstd/huf_decompress.c"
-# include "zstd/zstd_common.c"
-# include "zstd/decompress.c"
+# include "zstd/decompress_sources.h"
 #endif
 
 #include
diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h
new file mode 100644
index ..ccb4960ea0cd
--- /dev/null
+++ b/lib/zstd/decompress_sources.h
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * This file includes every .c file needed for decompression.
+ * It is used by lib/decompress_unzstd.c to include the decompression
+ * source into the translation-unit, so it can be used for kernel
+ * decompression.
+ */
+
+#include "entropy_common.c"
+#include "fse_decompress.c"
+#include "huf_decompress.c"
+#include "zstd_common.c"
+#include "decompress.c"
-- 
2.28.0
[PATCH v3 1/9] lib: zstd: Add zstd compatibility wrapper
From: Nick Terrell

Adds zstd_compat.h which provides the necessary functions from the
current zstd.h API. It is only active for zstd versions 1.4.6 and newer.
That means it is disabled currently, but will become active when a later
patch in this series updates the zstd library in the kernel to 1.4.6.

This header allows the zstd upgrade to 1.4.6 without changing any
callers, since they all include zstd through the compatibility wrapper.
Later patches in this series transition each caller away from the
compatibility wrapper. After all the callers have been transitioned away
from the compatibility wrapper, the final patch in this series deletes
it.

Signed-off-by: Nick Terrell
---
 crypto/zstd.c | 2 +-
 fs/btrfs/zstd.c | 2 +-
 fs/f2fs/compress.c | 2 +-
 fs/squashfs/zstd_wrapper.c | 2 +-
 include/linux/zstd_compat.h | 116
 lib/decompress_unzstd.c | 2 +-
 6 files changed, 121 insertions(+), 5 deletions(-)
 create mode 100644 include/linux/zstd_compat.h

diff --git a/crypto/zstd.c b/crypto/zstd.c
index 1a3309f066f7..dcda3cad3b5c 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 
 
diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index 9a4871636c6c..a7367ff573d4 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -16,7 +16,7 @@
 #include
 #include
 #include
-#include
+#include
 #include "misc.h"
 #include "compression.h"
 #include "ctree.h"
diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index 1dfb126a0cb2..e056f3a2b404 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -11,7 +11,7 @@
 #include
 #include
 #include
-#include
+#include
 
 #include "f2fs.h"
 #include "node.h"
diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
index b7cb1faa652d..f8c512a6204e 100644
--- a/fs/squashfs/zstd_wrapper.c
+++ b/fs/squashfs/zstd_wrapper.c
@@ -11,7 +11,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 
 #include "squashfs_fs.h"
diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h
new file mode 100644
index ..cda9208bf04a
--- /dev/null
+++ b/include/linux/zstd_compat.h
@@ -0,0 +1,116 @@
+/*
+ * Copyright (c) 2016-present, Facebook, Inc.
+ * All rights reserved.
+ *
+ * This source code is licensed under the BSD-style license found in the
+ * LICENSE file in the root directory of https://github.com/facebook/zstd.
+ * An additional grant of patent rights can be found in the PATENTS file in the
+ * same directory.
+ *
+ * This program is free software; you can redistribute it and/or modify it under
+ * the terms of the GNU General Public License version 2 as published by the
+ * Free Software Foundation. This program is dual-licensed; you may select
+ * either version 2 of the GNU General Public License ("GPL") or BSD license
+ * ("BSD").
+ */
+
+#ifndef ZSTD_COMPAT_H
+#define ZSTD_COMPAT_H
+
+#include
+
+#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406)
+/*
+ * This header provides backwards compatibility for the zstd-1.4.6 library
+ * upgrade. This header allows us to upgrade the zstd library version without
+ * modifying any callers. Then we will migrate callers from the compatibility
+ * wrapper one at a time until none remain. At which point we will delete this
+ * header.
+ *
+ * It is temporary and will be deleted once the upgrade is complete.
+ */
+
+#include
+
+static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params)
+{
+	return ZSTD_estimateCCtxSize_usingCParams(compression_params);
+}
+
+static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params)
+{
+	return ZSTD_estimateCStreamSize_usingCParams(compression_params);
+}
+
+static inline size_t ZSTD_DCtxWorkspaceBound(void)
+{
+	return ZSTD_estimateDCtxSize();
+}
+
+static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size)
+{
+	return ZSTD_estimateDStreamSize(window_size);
+}
+
+static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size)
+{
+	if (wksp == NULL)
+		return NULL;
+	return ZSTD_initStaticCCtx(wksp, wksp_size);
+}
+
+static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, uint64_t pledged_src_size, void* wksp, size_t wksp_size)
+{
+	ZSTD_CStream* cstream;
+	size_t ret;
+
+	if (wksp == NULL)
+		return NULL;
+
+	cstream = ZSTD_initStaticCStream(wksp, wksp_size);
+	if (cstream == NULL)
+		return NULL;
+
+	/* 0 means unknown in old API but means 0 in new API */
+	if (pledged_src_size == 0)
+		pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN;
+
+	ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size);
+	if (ZSTD_isError(ret))
+		return NULL;
+
+	return cstream;
+}
+#define ZS
[PATCH v3 0/9] Update to zstd-1.4.6
From: Nick Terrell

This patchset upgrades the zstd library to the latest upstream release. The
current zstd version in the kernel is a modified version of upstream
zstd-1.3.1. At the time it was integrated, zstd wasn't ready to be used in the
kernel as-is. But it is now possible to use upstream zstd directly in the
kernel.

I have not yet released zstd-1.4.6 upstream. I want the zstd version in the
kernel to match up with a known upstream release, so we know exactly what code
is running. Whenever this patchset is ready for merge, I will cut a release at
the upstream commit that gets merged. This should not be necessary for future
releases.

The kernel zstd library is automatically generated from upstream zstd. A script
makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested every commit by upstream's continuous integration.
When we cut a new zstd release, we will submit a patch to the kernel to update
the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must
build, so that precludes partial updates. Since the commit is 100% generated, I
hope the review burden is lightened. I considered replaying upstream commits,
but that is not possible because there have been ~3500 upstream commits since
the last zstd import, and the commits don't all build individually. The bulk
update preserves bisectability because bugs can be bisected to the zstd version
update. At that point the update can be reverted, and we can work with upstream
to find and fix the bug. After this big switch in how the kernel consumes zstd,
future patches will be smaller, because they will only have one upstream
release worth of changes each.

This patchset changes the zstd API from a custom kernel API to the upstream
API. I considered wrapping the upstream API with a wrapper that is closer to
the kernel style guide. Following advice from
https://lkml.org/lkml/2020/9/17/814 I've chosen to use the upstream API
directly, to minimize opportunities to introduce bugs, and because using the
upstream API directly makes debugging and communication with upstream easier.

This patchset comes in 3 parts:

1. The first 2 patches prepare for the zstd upgrade. The first patch adds a
   compatibility wrapper so zstd can be upgraded without modifying any callers.
   The second patch adds an indirection so that lib/decompress_unzstd.c
   includes all decompression source files through a single header.
2. Import zstd-1.4.6. This patch is completely generated from upstream using
   automated tooling.
3. Update all callers to the zstd-1.4.6 API then delete the compatibility
   wrapper.

I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade
using the compatibility wrapper, and after the final patch in this series. I
tested kernel and initramfs decompression in i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6. I
benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k. I found:

* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes and a 1 KB reduction in stack usage
during compression.

Please let me know if there is anything that I can do to ease the way for these
patches. I think this series is important because it brings large performance
improvements, contains bug fixes, and switches to a more maintainable model of
consuming upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 means
  unknown. This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed
  by the test.

v2 -> v3:
* (3/9) Silence warnings by Kernel Test Robot:
  https://github.com/facebook/zstd/pull/2324
  Stack size warnings remain, but these aren't new, and the functions it warns
  on are either unused or not in the maximum stack path. This patchset reduces
  zstd compression stack usage by 1 KB overall. I've gotten the low hanging
  fruit, and more stack reduction would require significant changes that have
  the potential to introduce new bugs. However, I do hope to continue to reduce
  zstd stack usage in future versions.

Nick Terrell (9):
  lib: zstd: Add zstd compatibility wrapper
  lib: zstd: Add decompress_sources.h for decompress_unzstd
  lib: zstd: Upgrade to latest upstream zstd version
[PATCH v2 4/9] crypto: zstd: Switch to zstd-1.4.6 API
From: Nick Terrell

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell
---
 crypto/zstd.c | 24 +++-
 1 file changed, 11 insertions(+), 13 deletions(-)

diff --git a/crypto/zstd.c b/crypto/zstd.c
index dcda3cad3b5c..767fe2fbe009 100644
--- a/crypto/zstd.c
+++ b/crypto/zstd.c
@@ -11,7 +11,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 
 
@@ -24,16 +24,15 @@ struct zstd_ctx {
 	void *dwksp;
 };
 
-static ZSTD_parameters zstd_params(void)
-{
-	return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0);
-}
-
 static int zstd_comp_init(struct zstd_ctx *ctx)
 {
 	int ret = 0;
-	const ZSTD_parameters params = zstd_params();
-	const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams);
+	const size_t wksp_size = ZSTD_estimateCCtxSize(ZSTD_DEF_LEVEL);
+
+	if (ZSTD_isError(wksp_size)) {
+		ret = -EINVAL;
+		goto out_free;
+	}
 
 	ctx->cwksp = vzalloc(wksp_size);
 	if (!ctx->cwksp) {
@@ -41,7 +40,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 		goto out;
 	}
 
-	ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size);
+	ctx->cctx = ZSTD_initStaticCCtx(ctx->cwksp, wksp_size);
 	if (!ctx->cctx) {
 		ret = -EINVAL;
 		goto out_free;
@@ -56,7 +55,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx)
 static int zstd_decomp_init(struct zstd_ctx *ctx)
 {
 	int ret = 0;
-	const size_t wksp_size = ZSTD_DCtxWorkspaceBound();
+	const size_t wksp_size = ZSTD_estimateDCtxSize();
 
 	ctx->dwksp = vzalloc(wksp_size);
 	if (!ctx->dwksp) {
@@ -64,7 +63,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx)
 		goto out;
 	}
 
-	ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size);
+	ctx->dctx = ZSTD_initStaticDCtx(ctx->dwksp, wksp_size);
 	if (!ctx->dctx) {
 		ret = -EINVAL;
 		goto out_free;
@@ -152,9 +151,8 @@ static int __zstd_compress(const u8 *src, unsigned int slen,
 {
 	size_t out_len;
 	struct zstd_ctx *zctx = ctx;
-	const ZSTD_parameters params = zstd_params();
 
-	out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params);
+	out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, ZSTD_DEF_LEVEL);
 	if (ZSTD_isError(out_len))
 		return -EINVAL;
 	*dlen = out_len;
-- 
2.28.0
[PATCH v2 7/9] squashfs: zstd: Switch to the zstd-1.4.6 API
From: Nick Terrell

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is functionally equivalent.

Signed-off-by: Nick Terrell
---
 fs/squashfs/zstd_wrapper.c | 9 -
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c
index f8c512a6204e..add582409866 100644
--- a/fs/squashfs/zstd_wrapper.c
+++ b/fs/squashfs/zstd_wrapper.c
@@ -11,7 +11,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 
 #include "squashfs_fs.h"
@@ -34,7 +34,7 @@ static void *zstd_init(struct squashfs_sb_info *msblk, void *buff)
 		goto failed;
 	wksp->window_size = max_t(size_t,
 			msblk->block_size, SQUASHFS_METADATA_SIZE);
-	wksp->mem_size = ZSTD_DStreamWorkspaceBound(wksp->window_size);
+	wksp->mem_size = ZSTD_estimateDStreamSize(wksp->window_size);
 	wksp->mem = vmalloc(wksp->mem_size);
 	if (wksp->mem == NULL)
 		goto failed;
@@ -71,7 +71,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
 	struct bvec_iter_all iter_all = {};
 	struct bio_vec *bvec = bvec_init_iter_all(&iter_all);
 
-	stream = ZSTD_initDStream(wksp->window_size, wksp->mem, wksp->mem_size);
+	stream = ZSTD_initStaticDStream(wksp->mem, wksp->mem_size);
 
 	if (!stream) {
 		ERROR("Failed to initialize zstd decompressor\n");
@@ -122,8 +122,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm,
 			break;
 
 		if (ZSTD_isError(zstd_err)) {
-			ERROR("zstd decompression error: %d\n",
-					(int)ZSTD_getErrorCode(zstd_err));
+			ERROR("zstd decompression error: %s\n", ZSTD_getErrorName(zstd_err));
 			error = -EIO;
 			break;
 		}
-- 
2.28.0
[PATCH v2 9/9] lib: zstd: Remove zstd compatibility wrapper
From: Nick Terrell

All callers have been transitioned to the new zstd-1.4.6 API. There are
no more callers of the zstd compatibility wrapper, so delete it.

Signed-off-by: Nick Terrell
---
 include/linux/zstd_compat.h | 116
 1 file changed, 116 deletions(-)
 delete mode 100644 include/linux/zstd_compat.h

diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h
deleted file mode 100644
index cda9208bf04a..
--- a/include/linux/zstd_compat.h
+++ /dev/null
@@ -1,116 +0,0 @@
-/*
- * Copyright (c) 2016-present, Facebook, Inc.
- * All rights reserved.
- *
- * This source code is licensed under the BSD-style license found in the
- * LICENSE file in the root directory of https://github.com/facebook/zstd.
- * An additional grant of patent rights can be found in the PATENTS file in the
- * same directory.
- *
- * This program is free software; you can redistribute it and/or modify it under
- * the terms of the GNU General Public License version 2 as published by the
- * Free Software Foundation. This program is dual-licensed; you may select
- * either version 2 of the GNU General Public License ("GPL") or BSD license
- * ("BSD").
- */
-
-#ifndef ZSTD_COMPAT_H
-#define ZSTD_COMPAT_H
-
-#include
-
-#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406)
-/*
- * This header provides backwards compatibility for the zstd-1.4.6 library
- * upgrade. This header allows us to upgrade the zstd library version without
- * modifying any callers. Then we will migrate callers from the compatibility
- * wrapper one at a time until none remain. At which point we will delete this
- * header.
- *
- * It is temporary and will be deleted once the upgrade is complete.
- */
-
-#include
-
-static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params)
-{
-	return ZSTD_estimateCCtxSize_usingCParams(compression_params);
-}
-
-static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params)
-{
-	return ZSTD_estimateCStreamSize_usingCParams(compression_params);
-}
-
-static inline size_t ZSTD_DCtxWorkspaceBound(void)
-{
-	return ZSTD_estimateDCtxSize();
-}
-
-static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size)
-{
-	return ZSTD_estimateDStreamSize(window_size);
-}
-
-static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size)
-{
-	if (wksp == NULL)
-		return NULL;
-	return ZSTD_initStaticCCtx(wksp, wksp_size);
-}
-
-static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, uint64_t pledged_src_size, void* wksp, size_t wksp_size)
-{
-	ZSTD_CStream* cstream;
-	size_t ret;
-
-	if (wksp == NULL)
-		return NULL;
-
-	cstream = ZSTD_initStaticCStream(wksp, wksp_size);
-	if (cstream == NULL)
-		return NULL;
-
-	/* 0 means unknown in old API but means 0 in new API */
-	if (pledged_src_size == 0)
-		pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN;
-
-	ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size);
-	if (ZSTD_isError(ret))
-		return NULL;
-
-	return cstream;
-}
-#define ZSTD_initCStream ZSTD_initCStream_compat
-
-static inline ZSTD_DCtx* ZSTD_initDCtx(void* wksp, size_t wksp_size)
-{
-	if (wksp == NULL)
-		return NULL;
-	return ZSTD_initStaticDCtx(wksp, wksp_size);
-}
-
-static inline ZSTD_DStream* ZSTD_initDStream_compat(unsigned long long window_size, void* wksp, size_t wksp_size)
-{
-	if (wksp == NULL)
-		return NULL;
-	(void)window_size;
-	return ZSTD_initStaticDStream(wksp, wksp_size);
-}
-#define ZSTD_initDStream ZSTD_initDStream_compat
-
-typedef ZSTD_frameHeader ZSTD_frameParams;
-
-static inline size_t ZSTD_getFrameParams(ZSTD_frameParams* frame_params, const void* src, size_t src_size)
-{
-	return ZSTD_getFrameHeader(frame_params, src, src_size);
-}
-
-static inline size_t ZSTD_compressCCtx_compat(ZSTD_CCtx* cctx, void* dst, size_t dst_capacity, const void* src, size_t src_size, ZSTD_parameters params)
-{
-	return ZSTD_compress_advanced(cctx, dst, dst_capacity, src, src_size, NULL, 0, params);
-}
-#define ZSTD_compressCCtx ZSTD_compressCCtx_compat
-
-#endif /* ZSTD_VERSION_NUMBER >= 10406 */
-#endif /* ZSTD_COMPAT_H */
-- 
2.28.0
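The one behavioural translation inside the deleted wrapper is the pledged source size: as its comment says, 0 meant "unknown" in the old API but means a literal size of 0 in the new one. A stand-alone sketch of that mapping follows. `DEMO_CONTENTSIZE_UNKNOWN` is a stand-in for upstream's `ZSTD_CONTENTSIZE_UNKNOWN` (defined upstream as `0ULL - 1`), and `translate_pledged_src_size` is a hypothetical helper, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for upstream's ZSTD_CONTENTSIZE_UNKNOWN sentinel. */
#define DEMO_CONTENTSIZE_UNKNOWN (0ULL - 1)

/*
 * Hypothetical helper: convert an old-API pledged size, where 0 means
 * "unknown", into the new-API convention, where 0 means an empty input
 * and the all-ones sentinel means "unknown".
 */
static unsigned long long translate_pledged_src_size(uint64_t old_api_size)
{
	if (old_api_size == 0)
		return DEMO_CONTENTSIZE_UNKNOWN;
	return old_api_size;
}
```

A caller that drops the wrapper and still passes 0 for "unknown" would instead be pledging an empty input, which is the F2FS problem the v1 -> v2 changelog describes.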
[PATCH v2 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API
From: Nick Terrell

Move away from the compatibility wrapper to the zstd-1.4.6 API. This
code is more efficient because it uses the single-pass API instead of
the streaming API. The streaming API is not necessary because the whole
input and output buffers are available. This saves memory because we
don't need to allocate a buffer for the window. It is also more
efficient because it saves unnecessary memcpy calls.

Compression memory increases from 168 KB to 204 KB because upstream
uses slightly more memory. Decompression memory decreases from 1.4 MB
to 158 KB.

Signed-off-by: Nick Terrell
---
 fs/f2fs/compress.c | 102 +
 1 file changed, 38 insertions(+), 64 deletions(-)

diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c
index e056f3a2b404..b79efce81651 100644
--- a/fs/f2fs/compress.c
+++ b/fs/f2fs/compress.c
@@ -11,7 +11,8 @@
 #include
 #include
 #include
-#include
+#include
+#include
 
 #include "f2fs.h"
 #include "node.h"
@@ -298,21 +299,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = {
 static int zstd_init_compress_ctx(struct compress_ctx *cc)
 {
 	ZSTD_parameters params;
-	ZSTD_CStream *stream;
+	ZSTD_CCtx *ctx;
 	void *workspace;
 	unsigned int workspace_size;
 
 	params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0);
-	workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams);
+	workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams);
 
 	workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode),
 					workspace_size, GFP_NOFS);
 	if (!workspace)
 		return -ENOMEM;
 
-	stream = ZSTD_initCStream(params, 0, workspace, workspace_size);
-	if (!stream) {
-		printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initCStream failed\n",
+	ctx = ZSTD_initStaticCCtx(workspace, workspace_size);
+	if (!ctx) {
+		printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_inittaticCStream failed\n",
 				KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
 				__func__);
 		kvfree(workspace);
@@ -320,7 +321,7 @@ static int zstd_init_compress_ctx(struct compress_ctx *cc)
 	}
 
 	cc->private = workspace;
-	cc->private2 = stream;
+	cc->private2 = ctx;
 
 	cc->clen = cc->rlen - PAGE_SIZE - COMPRESS_HEADER_SIZE;
 	return 0;
@@ -335,65 +336,48 @@ static void zstd_destroy_compress_ctx(struct compress_ctx *cc)
 
 static int zstd_compress_pages(struct compress_ctx *cc)
 {
-	ZSTD_CStream *stream = cc->private2;
-	ZSTD_inBuffer inbuf;
-	ZSTD_outBuffer outbuf;
-	int src_size = cc->rlen;
-	int dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
-	int ret;
-
-	inbuf.pos = 0;
-	inbuf.src = cc->rbuf;
-	inbuf.size = src_size;
-
-	outbuf.pos = 0;
-	outbuf.dst = cc->cbuf->cdata;
-	outbuf.size = dst_size;
-
-	ret = ZSTD_compressStream(stream, &outbuf, &inbuf);
-	if (ZSTD_isError(ret)) {
-		printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compressStream failed, ret: %d\n",
-				KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
-				__func__, ZSTD_getErrorCode(ret));
-		return -EIO;
-	}
-
-	ret = ZSTD_endStream(stream, &outbuf);
+	ZSTD_CCtx *ctx = cc->private2;
+	const size_t src_size = cc->rlen;
+	const size_t dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE;
+	ZSTD_parameters params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, src_size, 0);
+	size_t ret;
+
+	ret = ZSTD_compress_advanced(
+			ctx, cc->cbuf->cdata, dst_size, cc->rbuf, src_size, NULL, 0, params);
 	if (ZSTD_isError(ret)) {
-		printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_endStream returned %d\n",
+		/*
+		 * there is compressed data remained in intermediate buffer due to
+		 * no more space in cbuf.cdata
+		 */
+		if (ZSTD_getErrorCode(ret) == ZSTD_error_dstSize_tooSmall)
+			return -EAGAIN;
+		/* other compression errors return -EIO */
+		printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compress_advanced failed, err: %s\n",
 				KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id,
-				__func__, ZSTD_getErrorCode(ret));
+				__func__, ZSTD_getErrorName(ret));
 		return -EIO;
 	}
 
-	/*
-	 * there is compressed data remained in intermediate buffer due to
-	 * no more space in cbuf.cdata
-	 */
-	if (ret)
-		return -EAGAIN;
-
-	cc->clen = outbuf.
[PATCH v2 8/9] lib: unzstd: Switch to the zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is functionally equivalent. Signed-off-by: Nick Terrell --- lib/decompress_unzstd.c | 40 ++-- 1 file changed, 14 insertions(+), 26 deletions(-) diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c index a79f705f236d..d4685df0e120 100644 --- a/lib/decompress_unzstd.c +++ b/lib/decompress_unzstd.c @@ -73,7 +73,8 @@ #include #include -#include +#include +#include /* 128MB is the maximum window size supported by zstd. */ #define ZSTD_WINDOWSIZE_MAX(1 << ZSTD_WINDOWLOG_MAX) @@ -120,9 +121,9 @@ static int INIT decompress_single(const u8 *in_buf, long in_len, u8 *out_buf, long out_len, long *in_pos, void (*error)(char *x)) { - const size_t wksp_size = ZSTD_DCtxWorkspaceBound(); + const size_t wksp_size = ZSTD_estimateDCtxSize(); void *wksp = large_malloc(wksp_size); - ZSTD_DCtx *dctx = ZSTD_initDCtx(wksp, wksp_size); + ZSTD_DCtx *dctx = ZSTD_initStaticDCtx(wksp, wksp_size); int err; size_t ret; @@ -165,7 +166,6 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len, { ZSTD_inBuffer in; ZSTD_outBuffer out; - ZSTD_frameParams params; void *in_allocated = NULL; void *out_allocated = NULL; void *wksp = NULL; @@ -229,36 +229,24 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len, out.size = out_len; /* -* We need to know the window size to allocate the ZSTD_DStream. -* Since we are streaming, we need to allocate a buffer for the sliding -* window. The window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX -* (8 MB), so it is important to use the actual value so as not to -* waste memory when it is smaller. +* Zstd determines the workspace size from the window size written +* into the frame header. This ensures that we use the minimum value +* possible, since the window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX +* (1 GB), so it is very important to use the actual value. 
*/ - ret = ZSTD_getFrameParams(, in.src, in.size); + wksp_size = ZSTD_estimateDStreamSize_fromFrame(in.src, in.size); err = handle_zstd_error(ret, error); if (err) goto out; - if (ret != 0) { - error("ZSTD-compressed data has an incomplete frame header"); - err = -1; - goto out; - } - if (params.windowSize > ZSTD_WINDOWSIZE_MAX) { - error("ZSTD-compressed data has too large a window size"); + wksp = large_malloc(wksp_size); + if (wksp == NULL) { + error("Out of memory while allocating ZSTD_DStream"); err = -1; goto out; } - - /* -* Allocate the ZSTD_DStream now that we know how much memory is -* required. -*/ - wksp_size = ZSTD_DStreamWorkspaceBound(params.windowSize); - wksp = large_malloc(wksp_size); - dstream = ZSTD_initDStream(params.windowSize, wksp, wksp_size); + dstream = ZSTD_initStaticDStream(wksp, wksp_size); if (dstream == NULL) { - error("Out of memory while allocating ZSTD_DStream"); + error("ZSTD_initStaticDStream failed"); err = -1; goto out; } -- 2.28.0
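For readers wondering why the window size "varies from 1 KB" upward, here is a minimal sketch of how a window size is decoded from the one-byte window descriptor in a zstd frame header, following the published Zstandard frame format. The function name is hypothetical and this is not the library's code. It assumes the frame carries a window descriptor at all, which is not the case for single-segment frames.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Decode the window size from a zstd window descriptor byte:
 * the top 5 bits are the exponent, the low 3 bits are the mantissa.
 *   window_base = 1 << (10 + exponent)
 *   window_add  = (window_base / 8) * mantissa
 */
static uint64_t sketch_window_size(uint8_t window_descriptor)
{
	const unsigned int exponent = window_descriptor >> 3;
	const unsigned int mantissa = window_descriptor & 7;
	const uint64_t window_base = (uint64_t)1 << (10 + exponent);
	const uint64_t window_add = (window_base / 8) * mantissa;

	return window_base + window_add;
}

/* Small usage example: the smallest descriptor gives the 1 KB minimum. */
static void sketch_window_size_example(void)
{
	assert(sketch_window_size(0x00) == 1024);
}
```

Because the value comes from the frame being decompressed, the workspace estimate derived from it can be far smaller than one sized for the maximum window.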
[PATCH v2 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is functionally equivalent. Signed-off-by: Nick Terrell --- fs/btrfs/zstd.c | 48 1 file changed, 28 insertions(+), 20 deletions(-) diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c index a7367ff573d4..6b466e090cd7 100644 --- a/fs/btrfs/zstd.c +++ b/fs/btrfs/zstd.c @@ -16,7 +16,7 @@ #include #include #include -#include +#include #include "misc.h" #include "compression.h" #include "ctree.h" @@ -159,8 +159,8 @@ static void zstd_calc_ws_mem_sizes(void) zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT); size_t level_size = max_t(size_t, - ZSTD_CStreamWorkspaceBound(params.cParams), - ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT)); + ZSTD_estimateCStreamSize_usingCParams(params.cParams), + ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT)); max_size = max_t(size_t, max_size, level_size); zstd_ws_mem_sizes[level - 1] = max_size; @@ -389,13 +389,23 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping, *total_in = 0; /* Initialize the stream */ - stream = ZSTD_initCStream(params, len, workspace->mem, - workspace->size); + stream = ZSTD_initStaticCStream(workspace->mem, workspace->size); if (!stream) { - pr_warn("BTRFS: ZSTD_initCStream failed\n"); + pr_warn("BTRFS: ZSTD_initStaticCStream failed\n"); ret = -EIO; goto out; } + { + size_t ret2; + + ret2 = ZSTD_initCStream_advanced(stream, NULL, 0, params, len); + if (ZSTD_isError(ret2)) { + pr_warn("BTRFS: ZSTD_initCStream_advanced returned %s\n", + ZSTD_getErrorName(ret2)); + ret = -EIO; + goto out; + } + } /* map in the first page of input data */ in_page = find_get_page(mapping, start >> PAGE_SHIFT); @@ -421,8 +431,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping, ret2 = ZSTD_compressStream(stream, >out_buf, >in_buf); if (ZSTD_isError(ret2)) { - pr_debug("BTRFS: ZSTD_compressStream returned %d\n", - ZSTD_getErrorCode(ret2)); + pr_debug("BTRFS: ZSTD_compressStream returned 
%s\n", + ZSTD_getErrorName(ret2)); ret = -EIO; goto out; } @@ -489,8 +499,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping, ret2 = ZSTD_endStream(stream, >out_buf); if (ZSTD_isError(ret2)) { - pr_debug("BTRFS: ZSTD_endStream returned %d\n", - ZSTD_getErrorCode(ret2)); + pr_debug("BTRFS: ZSTD_endStream returned %s\n", + ZSTD_getErrorName(ret2)); ret = -EIO; goto out; } @@ -557,10 +567,9 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb) unsigned long buf_start; unsigned long total_out = 0; - stream = ZSTD_initDStream( - ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size); + stream = ZSTD_initStaticDStream(workspace->mem, workspace->size); if (!stream) { - pr_debug("BTRFS: ZSTD_initDStream failed\n"); + pr_debug("BTRFS: ZSTD_initStaticDStream failed\n"); ret = -EIO; goto done; } @@ -579,8 +588,8 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb) ret2 = ZSTD_decompressStream(stream, >out_buf, >in_buf); if (ZSTD_isError(ret2)) { - pr_debug("BTRFS: ZSTD_decompressStream returned %d\n", - ZSTD_getErrorCode(ret2)); + pr_debug("BTRFS: ZSTD_decompressStream returned %s\n", + ZSTD_getErrorName(ret2)); ret = -EIO; goto done; } @@ -633,10 +642,9 @@ int zstd_decompress(struct list_head *ws, unsigned char *data_in, unsigned long pg_offset = 0; char *kaddr; -
[PATCH v2 0/9] Update to zstd-1.4.6
From: Nick Terrell

This patchset upgrades the zstd library to the latest upstream release. The current zstd version in the kernel is a modified version of upstream zstd-1.3.1. At the time it was integrated, zstd wasn't ready to be used in the kernel as-is. But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.6 upstream. I want the zstd version in the kernel to match up with a known upstream release, so we know exactly what code is running. Whenever this patchset is ready for merge, I will cut a release at the upstream commit that gets merged. This should not be necessary for future releases.

The kernel zstd library is automatically generated from upstream zstd. A script makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested on every commit by upstream's continuous integration. When we cut a new zstd release, we will submit a patch to the kernel to update the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must build, so that precludes partial updates. Since the commit is 100% generated, I hope the review burden is lightened. I considered replaying upstream commits, but that is not possible because there have been ~3500 upstream commits since the last zstd import, and the commits don't all build individually. The bulk update preserves bisectability because bugs can be bisected to the zstd version update. At that point the update can be reverted, and we can work with upstream to find and fix the bug. After this big switch in how the kernel consumes zstd, future patches will be smaller, because they will only have one upstream release worth of changes each.

This patchset comes in 3 parts:

1. The first 2 patches prepare for the zstd upgrade. The first patch adds a compatibility wrapper so zstd can be upgraded without modifying any callers. The second patch adds an indirection for the lib/decompress_unzstd.c including of all decompression source files.
2. Import zstd-1.4.6. This patch is completely generated from upstream using automated tooling.
3. Update all callers to the zstd-1.4.6 API then delete the compatibility wrapper.

I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade using the compatibility wrapper, and after the final patch in this series. I tested kernel and initramfs decompression in i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6. I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k. I found:

* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* F2FS zstd compression+write at level 3 is 8% faster
* F2FS zstd decompression+read is 20% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during compression.

Please let me know if there is anything that I can do to ease the way for these patches. I think it is important because it gets large performance improvements, contains bug fixes, and is switching to a more maintainable model of consuming upstream zstd directly, making it easy to keep up to date.

Best,
Nick Terrell

v1 -> v2:
* Successfully tested F2FS with help from Chao Yu to fix my test.
* (1/9) Fix ZSTD_initCStream() wrapper to handle pledged_src_size=0 means unknown. This fixes F2FS with the zstd-1.4.6 compatibility wrapper, exposed by the test.
Nick Terrell (9): lib: zstd: Add zstd compatibility wrapper lib: zstd: Add decompress_sources.h for decompress_unzstd lib: zstd: Upgrade to latest upstream zstd version 1.4.6 crypto: zstd: Switch to zstd-1.4.6 API btrfs: zstd: Switch to the zstd-1.4.6 API f2fs: zstd: Switch to the zstd-1.4.6 API squashfs: zstd: Switch to the zstd-1.4.6 API lib: unzstd: Switch to the zstd-1.4.6 API lib: zstd: Remove zstd compatibility wrapper crypto/zstd.c | 22 +- fs/btrfs/zstd.c | 46 +- fs/f2fs/compress.c| 100 +- fs/squashfs/zstd_wrapper.c|7 +- include/linux/zstd.h | 3019 include/linux/zstd_errors.h | 76 + lib/decompress_unzstd.c | 44 +- lib/zstd/Makefile | 35 +- lib/zstd/bitstream.h | 379 -- lib/zstd/common/bitstream.h | 437 ++ lib/zstd/common/compiler.h| 134 + lib/zstd/common/cpu.h | 194 + lib/z
[PATCH v2 1/9] lib: zstd: Add zstd compatibility wrapper
From: Nick Terrell Adds zstd_compat.h which provides the necessary functions from the current zstd.h API. It is only active for zstd versions 1.4.6 and newer. That means it is disabled currently, but will become active when a later patch in this series updates the zstd library in the kernel to 1.4.6. This header allows the zstd upgrade to 1.4.6 without changing any callers, since they all include zstd through the compatibility wrapper. Later patches in this series transition each caller away from the compatibility wrapper. After all the callers have been transitioned away from the compatibility wrapper, the final patch in this series deletes it. Signed-off-by: Nick Terrell --- crypto/zstd.c | 2 +- fs/btrfs/zstd.c | 2 +- fs/f2fs/compress.c | 2 +- fs/squashfs/zstd_wrapper.c | 2 +- include/linux/zstd_compat.h | 116 lib/decompress_unzstd.c | 2 +- 6 files changed, 121 insertions(+), 5 deletions(-) create mode 100644 include/linux/zstd_compat.h diff --git a/crypto/zstd.c b/crypto/zstd.c index 1a3309f066f7..dcda3cad3b5c 100644 --- a/crypto/zstd.c +++ b/crypto/zstd.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c index 9a4871636c6c..a7367ff573d4 100644 --- a/fs/btrfs/zstd.c +++ b/fs/btrfs/zstd.c @@ -16,7 +16,7 @@ #include #include #include -#include +#include #include "misc.h" #include "compression.h" #include "ctree.h" diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 1dfb126a0cb2..e056f3a2b404 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include "f2fs.h" #include "node.h" diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c index b7cb1faa652d..f8c512a6204e 100644 --- a/fs/squashfs/zstd_wrapper.c +++ b/fs/squashfs/zstd_wrapper.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include #include "squashfs_fs.h" diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h 
new file mode 100644 index ..cda9208bf04a --- /dev/null +++ b/include/linux/zstd_compat.h @@ -0,0 +1,116 @@ +/* + * Copyright (c) 2016-present, Facebook, Inc. + * All rights reserved. + * + * This source code is licensed under the BSD-style license found in the + * LICENSE file in the root directory of https://github.com/facebook/zstd. + * An additional grant of patent rights can be found in the PATENTS file in the + * same directory. + * + * This program is free software; you can redistribute it and/or modify it under + * the terms of the GNU General Public License version 2 as published by the + * Free Software Foundation. This program is dual-licensed; you may select + * either version 2 of the GNU General Public License ("GPL") or BSD license + * ("BSD"). + */ + +#ifndef ZSTD_COMPAT_H +#define ZSTD_COMPAT_H + +#include + +#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406) +/* + * This header provides backwards compatibility for the zstd-1.4.6 library + * upgrade. This header allows us to upgrade the zstd library version without + * modifying any callers. Then we will migrate callers from the compatibility + * wrapper one at a time until none remain. At which point we will delete this + * header. + * + * It is temporary and will be deleted once the upgrade is complete. 
+ */ + +#include + +static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params) +{ +return ZSTD_estimateCCtxSize_usingCParams(compression_params); +} + +static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params) +{ +return ZSTD_estimateCStreamSize_usingCParams(compression_params); +} + +static inline size_t ZSTD_DCtxWorkspaceBound(void) +{ +return ZSTD_estimateDCtxSize(); +} + +static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size) +{ +return ZSTD_estimateDStreamSize(window_size); +} + +static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size) +{ +if (wksp == NULL) +return NULL; +return ZSTD_initStaticCCtx(wksp, wksp_size); +} + +static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, uint64_t pledged_src_size, void* wksp, size_t wksp_size) +{ +ZSTD_CStream* cstream; +size_t ret; + +if (wksp == NULL) +return NULL; + +cstream = ZSTD_initStaticCStream(wksp, wksp_size); +if (cstream == NULL) +return NULL; + +/* 0 means unknown in old API but means 0 in new API */ +if (pledged_src_size == 0) +pledged_src_size = ZSTD_CONTENTSIZE_UNKNOWN; + +ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size); +if (ZSTD_isError(ret)) +return NULL; + +return cstream; +} +#define ZS
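Two details of the compatibility header above are easy to miss, so here is a small self-contained sketch of both. The names prefixed with sketch_ are hypothetical and are defined locally so the example builds without zstd.h. The sketch assumes the upstream conventions that the version number is major * 10000 + minor * 100 + release, and that the "unknown content size" sentinel is the all-ones 64-bit value.

```c
#include <assert.h>

/* Assumed to mirror upstream's ZSTD_CONTENTSIZE_UNKNOWN (all bits set). */
#define SKETCH_CONTENTSIZE_UNKNOWN (0ULL - 1)

/* Encode a version the way the ZSTD_VERSION_NUMBER check expects. */
static unsigned int sketch_version_number(unsigned int major,
					  unsigned int minor,
					  unsigned int release)
{
	return major * 100 * 100 + minor * 100 + release;
}

/*
 * Translate a pledged source size from the old convention to the new one:
 * 0 meant "unknown" in the old API, but means "exactly 0 bytes" in the new
 * API, so 0 is rewritten to the explicit "unknown" sentinel.
 */
static unsigned long long sketch_pledged_src_size(unsigned long long old_size)
{
	return old_size == 0 ? SKETCH_CONTENTSIZE_UNKNOWN : old_size;
}

/* Small usage example: 1.4.6 encodes to the 10406 used in the header guard. */
static void sketch_compat_example(void)
{
	assert(sketch_version_number(1, 4, 6) == 10406);
	assert(sketch_pledged_src_size(0) == SKETCH_CONTENTSIZE_UNKNOWN);
}
```

The first function shows why the wrapper stays inactive against the current 1.3.1-based library (10301 is below 10406). The second is the translation mentioned in the v1 -> v2 notes of the cover letter.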
[PATCH v2 2/9] lib: zstd: Add decompress_sources.h for decompress_unzstd
From: Nick Terrell Adds decompress_sources.h which includes every .c file necessary for zstd decompression. This is used in decompress_unzstd.c so the internal structure of the library isn't exposed. This allows us to upgrade the zstd library version without modifying any callers. Instead we just need to update decompress_sources.h. Signed-off-by: Nick Terrell --- lib/decompress_unzstd.c | 6 +- lib/zstd/decompress_sources.h | 14 ++ 2 files changed, 15 insertions(+), 5 deletions(-) create mode 100644 lib/zstd/decompress_sources.h diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c index dbc290af26b4..a79f705f236d 100644 --- a/lib/decompress_unzstd.c +++ b/lib/decompress_unzstd.c @@ -68,11 +68,7 @@ #ifdef STATIC # define UNZSTD_PREBOOT # include "xxhash.c" -# include "zstd/entropy_common.c" -# include "zstd/fse_decompress.c" -# include "zstd/huf_decompress.c" -# include "zstd/zstd_common.c" -# include "zstd/decompress.c" +# include "zstd/decompress_sources.h" #endif #include diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h new file mode 100644 index ..ccb4960ea0cd --- /dev/null +++ b/lib/zstd/decompress_sources.h @@ -0,0 +1,14 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* + * This file includes every .c file needed for decompression. + * It is used by lib/decompress_unzstd.c to include the decompression + * source into the translation-unit, so it can be used for kernel + * decompression. + */ + +#include "entropy_common.c" +#include "fse_decompress.c" +#include "huf_decompress.c" +#include "zstd_common.c" +#include "decompress.c" -- 2.28.0
Re: [PATCH 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API
> On Sep 17, 2020, at 6:47 PM, Chao Yu wrote: > > On 2020/9/18 3:34, Nick Terrell wrote: >>> On Sep 17, 2020, at 11:00 AM, Nick Terrell wrote: >>> >>> >>> >>>> On Sep 16, 2020, at 11:31 PM, Chao Yu wrote: >>>> >>>> Hi Nick, >>>> >>>> On 2020/9/17 2:39, Nick Terrell wrote: >>>>>> On Sep 15, 2020, at 11:31 PM, Chao Yu wrote: >>>>>> >>>>>> Hi Nick, >>>>>> >>>>>> remove not related mailing list. >>>>>> >>>>>> On 2020/9/16 11:43, Nick Terrell wrote: >>>>>>> From: Nick Terrell >>>>>>> Move away from the compatibility wrapper to the zstd-1.4.6 API. This >>>>>>> code is more efficient because it uses the single-pass API instead of >>>>>>> the streaming API. The streaming API is not necessary because the whole >>>>>>> input and output buffers are available. This saves memory because we >>>>>>> don't need to allocate a buffer for the window. It is also more >>>>>>> efficient because it saves unnecessary memcpy calls. >>>>>>> I've had problems testing this code because I see data truncation before >>>>>>> and after this patchset. Help testing this patch would be much >>>>>>> appreciated. >>>>>> >>>>>> Can you please explain more about data truncation? I'm a little >>>>>> confused... >>>>>> >>>>>> Do you mean that f2fs doesn't allocate enough memory for zstd >>>>>> compression, >>>>>> so that compression is not finished actually, the compressed data is >>>>>> truncated >>>>>> at dst buffer? >>>>> Hi Chao, >>>>> I’ve tested F2FS using a benchmark I adapted from testing BtrFS [0]. It >>>>> is possible >>>>> that the script I’m using is buggy or is exposing an edge case in F2FS. >>>>> The files >>>>> that I copy to F2FS and compress end up truncated with a hole at the end. >>>> >>>> Thanks for your explanation. :) >>>> >>>>> It is based off of upstream commit ab29a807a7. >>>>> E.g. the end of the copied file looks like this, but the original file >>>>> has non-zero data >>>>> In the end. Until the hole at the end the file is correct. 
>>>>> od dickens | tail -n 5 >>>>>> 46667760 067502 066167 020056 040440 020163 023511 006555 060412 >>>>>> 4667 00 00 00 00 00 00 00 00 >>>>>> * >>>>>> 46703060 00 00 00 00 00 00 00 >>>>>> 46703076 >>>>> [0] https://gist.github.com/terrelln/7dd2919937dfbdb8e839e4ad11c81db4 >>>> >>>> Shouldn't we just get sha1 value by flitering sha1sum output? >>>> >>>> asha=`sha1sum $BENCHMARK_DIR/$file |awk {'print $1'}` >>>> bsha=`sha1sum $MP/$i/$file |awk {'print $1'}` >>> >>> Probably, but it was just a quick one-off script. >> Ah, never mind, you are right. >>>> I can't reproduce this issue by using simple data sample, could you share >>>> that 'dickens' file or other smaller-sized sample if you have? >>> >>> The /tmp/silesia directory in the example is populated with all the files >>> from >>> this website. It is a popular data compression benchmark corpus. You can >>> click on the “total” link to download a zip archive of all the files. >>> >>> http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia >>> >>> -Nick >> I’ve spent some time minimizing the test case. This script [0] is the >> minimized >> test case that doesn’t require any input files, it builds its own. >> Several observations: >> * The input file needs to be 7700481 bytes large, smaller files don’t >> trigger the bug. >> * You have to `chattr +c` the file after copying it otherwise the bug >> doesn’t occur. >> * After `chattr +c` you have to unmount and remount the filesystem to >> trigger the bug. >> I’ve reproduced on v5.9-rc5 (856deb866d16e). I’ve also reproduced on m
Re: [PATCH 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API
> On Sep 17, 2020, at 11:00 AM, Nick Terrell wrote: > > > >> On Sep 16, 2020, at 11:31 PM, Chao Yu wrote: >> >> Hi Nick, >> >> On 2020/9/17 2:39, Nick Terrell wrote: >>>> On Sep 15, 2020, at 11:31 PM, Chao Yu wrote: >>>> >>>> Hi Nick, >>>> >>>> remove not related mailing list. >>>> >>>> On 2020/9/16 11:43, Nick Terrell wrote: >>>>> From: Nick Terrell >>>>> Move away from the compatibility wrapper to the zstd-1.4.6 API. This >>>>> code is more efficient because it uses the single-pass API instead of >>>>> the streaming API. The streaming API is not necessary because the whole >>>>> input and output buffers are available. This saves memory because we >>>>> don't need to allocate a buffer for the window. It is also more >>>>> efficient because it saves unnecessary memcpy calls. >>>>> I've had problems testing this code because I see data truncation before >>>>> and after this patchset. Help testing this patch would be much >>>>> appreciated. >>>> >>>> Can you please explain more about data truncation? I'm a little confused... >>>> >>>> Do you mean that f2fs doesn't allocate enough memory for zstd compression, >>>> so that compression is not finished actually, the compressed data is >>>> truncated >>>> at dst buffer? >>> Hi Chao, >>> I’ve tested F2FS using a benchmark I adapted from testing BtrFS [0]. It is >>> possible >>> that the script I’m using is buggy or is exposing an edge case in F2FS. The >>> files >>> that I copy to F2FS and compress end up truncated with a hole at the end. >> >> Thanks for your explanation. :) >> >>> It is based off of upstream commit ab29a807a7. >>> E.g. the end of the copied file looks like this, but the original file has >>> non-zero data >>> In the end. Until the hole at the end the file is correct. 
>>> od dickens | tail -n 5 >>>> 46667760 067502 066167 020056 040440 020163 023511 006555 060412 >>>> 4667 00 00 00 00 00 00 00 00 >>>> * >>>> 46703060 00 00 00 00 00 00 00 >>>> 46703076 >>> [0] https://gist.github.com/terrelln/7dd2919937dfbdb8e839e4ad11c81db4 >> >> Shouldn't we just get sha1 value by flitering sha1sum output? >> >> asha=`sha1sum $BENCHMARK_DIR/$file |awk {'print $1'}` >> bsha=`sha1sum $MP/$i/$file |awk {'print $1'}` > > Probably, but it was just a quick one-off script. Ah, never mind, you are right. >> I can't reproduce this issue by using simple data sample, could you share >> that 'dickens' file or other smaller-sized sample if you have? > > The /tmp/silesia directory in the example is populated with all the files from > this website. It is a popular data compression benchmark corpus. You can > click on the “total” link to download a zip archive of all the files. > > http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia > > -Nick I’ve spent some time minimizing the test case. This script [0] is the minimized test case that doesn’t require any input files, it builds its own. Several observations: * The input file needs to be 7700481 bytes large, smaller files don’t trigger the bug. * You have to `chattr +c` the file after copying it otherwise the bug doesn’t occur. * After `chattr +c` you have to unmount and remount the filesystem to trigger the bug. I’ve reproduced on v5.9-rc5 (856deb866d16e). I’ve also reproduced on my host machine running 5.8.5-arch1-1. 
[0] https://gist.github.com/terrelln/4bba325abdfa3a6f014e9911ac92a185 Best, Nick >> Thanks, >> >>> Best, >>> Nick >>>> Thanks, >>>> >>>>> Signed-off-by: Nick Terrell >>>>> --- >>>>> fs/f2fs/compress.c | 102 + >>>>> 1 file changed, 38 insertions(+), 64 deletions(-) >>>>> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c >>>>> index e056f3a2b404..b79efce81651 100644 >>>>> --- a/fs/f2fs/compress.c >>>>> +++ b/fs/f2fs/compress.c >>>>> @@ -11,7 +11,8 @@ >>>>> #include >>>>> #include >>>>> #include >>>>> -#include >>>>> +#include >>>>> +#include >>>>> #include "f2fs.h" >>>>> #include "node.h" >&g
Re: [PATCH 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API
> On Sep 16, 2020, at 11:31 PM, Chao Yu wrote: > > Hi Nick, > > On 2020/9/17 2:39, Nick Terrell wrote: >>> On Sep 15, 2020, at 11:31 PM, Chao Yu wrote: >>> >>> Hi Nick, >>> >>> remove not related mailing list. >>> >>> On 2020/9/16 11:43, Nick Terrell wrote: >>>> From: Nick Terrell >>>> Move away from the compatibility wrapper to the zstd-1.4.6 API. This >>>> code is more efficient because it uses the single-pass API instead of >>>> the streaming API. The streaming API is not necessary because the whole >>>> input and output buffers are available. This saves memory because we >>>> don't need to allocate a buffer for the window. It is also more >>>> efficient because it saves unnecessary memcpy calls. >>>> I've had problems testing this code because I see data truncation before >>>> and after this patchset. Help testing this patch would be much >>>> appreciated. >>> >>> Can you please explain more about data truncation? I'm a little confused... >>> >>> Do you mean that f2fs doesn't allocate enough memory for zstd compression, >>> so that compression is not finished actually, the compressed data is >>> truncated >>> at dst buffer? >> Hi Chao, >> I’ve tested F2FS using a benchmark I adapted from testing BtrFS [0]. It is >> possible >> that the script I’m using is buggy or is exposing an edge case in F2FS. The >> files >> that I copy to F2FS and compress end up truncated with a hole at the end. > > Thanks for your explanation. :) > >> It is based off of upstream commit ab29a807a7. >> E.g. the end of the copied file looks like this, but the original file has >> non-zero data >> In the end. Until the hole at the end the file is correct. >> od dickens | tail -n 5 >>> 46667760 067502 066167 020056 040440 020163 023511 006555 060412 >>> 4667 00 00 00 00 00 00 00 00 >>> * >>> 46703060 00 00 00 00 00 00 00 >>> 46703076 >> [0] https://gist.github.com/terrelln/7dd2919937dfbdb8e839e4ad11c81db4 > > Shouldn't we just get sha1 value by flitering sha1sum output? 
> >asha=`sha1sum $BENCHMARK_DIR/$file |awk {'print $1'}` >bsha=`sha1sum $MP/$i/$file |awk {'print $1'}` Probably, but it was just a quick one-off script. > I can't reproduce this issue by using simple data sample, could you share > that 'dickens' file or other smaller-sized sample if you have? The /tmp/silesia directory in the example is populated with all the files from this website. It is a popular data compression benchmark corpus. You can click on the “total” link to download a zip archive of all the files. http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia -Nick > Thanks, > >> Best, >> Nick >>> Thanks, >>> >>>> Signed-off-by: Nick Terrell >>>> --- >>>> fs/f2fs/compress.c | 102 + >>>> 1 file changed, 38 insertions(+), 64 deletions(-) >>>> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c >>>> index e056f3a2b404..b79efce81651 100644 >>>> --- a/fs/f2fs/compress.c >>>> +++ b/fs/f2fs/compress.c >>>> @@ -11,7 +11,8 @@ >>>> #include >>>> #include >>>> #include >>>> -#include >>>> +#include >>>> +#include >>>>#include "f2fs.h" >>>> #include "node.h" >>>> @@ -298,21 +299,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = >>>> { >>>> static int zstd_init_compress_ctx(struct compress_ctx *cc) >>>> { >>>>ZSTD_parameters params; >>>> - ZSTD_CStream *stream; >>>> + ZSTD_CCtx *ctx; >>>>void *workspace; >>>>unsigned int workspace_size; >>>>params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0); >>>> - workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams); >>>> + workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams); >>>>workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode), >>>>workspace_size, GFP_NOFS); >>>>if (!workspace) >>>>return -ENOMEM; >>>> - stream = ZSTD_initCStream(params, 0, workspace, workspace_size); >>>> - if (!stream) { >>>> - printk_ratelimited("%sF2
Re: [PATCH 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API
> On Sep 17, 2020, at 7:28 AM, Chris Mason wrote: > > On 17 Sep 2020, at 6:04, Christoph Hellwig wrote: > >> On Wed, Sep 16, 2020 at 09:35:51PM -0400, Rik van Riel wrote: One possibility is to have a kernel wrapper on top of the zstd API to make it more ergonomic. I personally don't really see the value in it, since it adds another layer of indirection between zstd and the caller, but it could be done. >>> >>> Zstd would not be the first part of the kernel to >>> come from somewhere else, and have wrappers when >>> it gets integrated into the kernel. There certainly >>> is precedence there. >>> >>> It would be interesting to know what Christoph's >>> preference is. >> >> Yes, I think kernel wrappers would be a pretty sensible step forward. >> That also avoid the need to do strange upgrades to a new version, >> and instead we can just change APIs on a as-needed basis. > > When we add wrappers, we end up creating a kernel specific API that doesn’t > match the upstream zstd docs, and it doesn’t leverage as much of the zstd > fuzzing and testing. > > So we’re actually making kernel zstd slightly less usable in hopes that our > kernel specific part of the API is familiar enough to us that it makes zstd > more usable. There’s no way to compare the two until the wrappers are done, > but given the code today I’d prefer that we focus on making it really easy to > track upstream. I really understand Christoph’s side here, but I’d rather > ride a camel with the group than go it alone. > > I’d also much rather spend time on any problems where the structure of the > zstd APIs don’t fit the kernel’s needs. The btrfs streaming > compression/decompression looks pretty clean to me, but I think Johannes > mentioned some possibilities to improve things for zswap (optimizations for > page-at-a-time). If there are places where the zstd memory management or > error handling don’t fit naturally into the kernel, that would also be higher > on my list. 
This update includes the recent optimizations for ZSwap that I've made, which give a 30% speed boost for page-at-a-time decompression. We're very open to improving and changing zstd to better fit the needs of the kernel. If there are use cases that can't use the existing API, or the existing API isn't optimal, or any other problems, we’re happy to help figure out the best solution. Opening an issue on our upstream GitHub repo is the best way to get our attention. -Nick > Fixing those are probably going to be much easier if we’re close to the zstd > upstream, again so that we can leverage testing and long term code > maintenance done there. > > -chris
Re: [PATCH 1/9] lib: zstd: Add zstd compatibility wrapper
> On Sep 16, 2020, at 1:48 AM, Christoph Hellwig wrote: > > On Tue, Sep 15, 2020 at 08:42:54PM -0700, Nick Terrell wrote: >> From: Nick Terrell >> >> Adds zstd_compat.h which provides the necessary functions from the >> current zstd.h API. It is only active for zstd versions 1.4.6 and newer. >> That means it is disabled currently, but will become active when a later >> patch in this series updates the zstd library in the kernel to 1.4.6. >> >> This header allows the zstd upgrade to 1.4.6 without changing any >> callers, since they all include zstd through the compatibility wrapper. >> Later patches in this series transition each caller away from the >> compatibility wrapper. After all the callers have been transitioned away >> from the compatibility wrapper, the final patch in this series deletes >> it. > > Please just add wrappes to the main header instead of causing all > this churn. The goal of having it in a separate header is so the 3rd patch that actually updates zstd can be 100% automatically generated. I didn’t want to mix a small amount of edits into a large generated patch, because that would be easy to miss.
Re: [PATCH 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API
> On Sep 16, 2020, at 7:46 AM, Christoph Hellwig wrote: > > On Wed, Sep 16, 2020 at 10:43:04AM -0400, Chris Mason wrote: >> Otherwise we just end up with drift and kernel-specific bugs that are harder >> to debug. To the extent those APIs make us contort the kernel code, I’m >> sure Nick is interested in improving things in both places. > > Seriously, we do not care elsewhere. Why would zlib be any different? > >> There are probably 1000 constructive ways to have that conversation. Please >> choose one of those instead of being an asshole. > > I think you are the asshole here by ignoring the practices we are using > elsewhere and think your employers pet project is somehow special. It > is not, and claiming so is everything but constructive. My goal in updating the kernel's zstd to use the upstream API directly is to make frequent syncs into the kernel easy. This is important so the kernel doesn't miss out on bug fixes and performance improvements. The upstream zstd is continuously fuzzed and is battle tested in production and across many different projects external to Facebook. That means that zstd-1.4.6 has an additional 3 years of continuous fuzzing, as well as improvements to our fuzz and test suite. The zstd version in the kernel works fine. But, you can see that the version that got imported stagnated while upstream released 14 versions. I don't think it makes sense to have kernel developers maintain their own copy of zstd. Their time would be better spent working on the rest of the kernel. Using upstream directly lets the kernel profit from the work that we, the zstd developers, are doing. And it still allows kernel developers to fix bugs if any show up, and we can back-port them to upstream. For example, I’ve measured that BtrFS decompression + read performance is improved 15% with this patch. And ZRAM performance improves 30%. And SquashFS decompression + read performance improves 15%. 
Admittedly, the API provided for static workspace allocation is verbose. Most zstd users don’t need it, so our efforts to improve the ergonomics of the API haven’t been focused here. At this point, we couldn’t rename these APIs easily, since we have users relying on our API. It could be done, because we don’t guarantee ABI stability for this portion of the API, but we would have to have a good reason for it. One possibility is to have a kernel wrapper on top of the zstd API to make it more ergonomic. I personally don’t really see the value in it, since it adds another layer of indirection between zstd and the caller, but it could be done. Of all the compressors in the kernel, only lz4 and zstd are under active development. And lz4 has switched to using the upstream API directly. Xz does see a little bit of development, but nothing has been synced to the kernel. Best, Nick
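To make the static-workspace discussion above concrete, here is a minimal sketch of the general pattern: the library reports a size, the caller owns the allocation, and the context is constructed inside that buffer. Every name here (toy_ctx, toy_estimate_ctx_size, toy_init_static_ctx, toy_ctx_create) is hypothetical and is not part of zstd or the kernel; the last function only shows what a thin "ergonomic" wrapper layered on top of such an API could look like, under the assumption that a plain malloc-style allocator is acceptable.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a compressor context; NOT a zstd type. */
struct toy_ctx {
	void *scratch;       /* scratch area carved out of the caller's workspace */
	size_t scratch_size; /* bytes available in the scratch area */
};

#define TOY_SCRATCH_SIZE 4096u

/* Step 1: tell the caller how many bytes it must provide. */
static size_t toy_estimate_ctx_size(void)
{
	return sizeof(struct toy_ctx) + TOY_SCRATCH_SIZE;
}

/*
 * Step 2: build the context inside caller-owned memory.
 * Returns NULL if the buffer is missing, misaligned, or too small.
 */
static struct toy_ctx *toy_init_static_ctx(void *wksp, size_t wksp_size)
{
	struct toy_ctx *ctx;

	if (wksp == NULL || ((uintptr_t)wksp % sizeof(void *)) != 0)
		return NULL;
	if (wksp_size < toy_estimate_ctx_size())
		return NULL;

	ctx = wksp;
	ctx->scratch = (unsigned char *)wksp + sizeof(*ctx);
	ctx->scratch_size = wksp_size - sizeof(*ctx);
	return ctx;
}

/*
 * Hypothetical convenience wrapper: estimate, allocate, and initialize in
 * one call. The caller still owns *wksp_out and must free() it.
 */
static struct toy_ctx *toy_ctx_create(void **wksp_out)
{
	size_t size = toy_estimate_ctx_size();
	void *wksp = malloc(size);
	struct toy_ctx *ctx;

	if (wksp == NULL)
		return NULL;

	ctx = toy_init_static_ctx(wksp, size);
	if (ctx == NULL) {
		free(wksp);
		return NULL;
	}
	*wksp_out = wksp;
	return ctx;
}
```

The trade-off debated in the thread shows up even in this toy: the wrapper saves two calls per user, but it is one more function that has to track whatever the underlying estimate/init pair does.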
Re: [PATCH 3/9] lib: zstd: Upgrade to latest upstream zstd version 1.4.6
> On Sep 15, 2020, at 8:42 PM, Nick Terrell wrote: > > From: Nick Terrell > > Upgrade to the latest upstream zstd version 1.4.6. > > This patch is 100% generated from upstream zstd commit c4763f087c2b [0]. > > This patch is very large because it is transitioning from the custom > kernel zstd to using upstream directly. The new zstd follows upstreams > file structure which is different. Future update patches will be much > smaller because they will only contain the changes from one upstream > zstd release. > > The benefits of this patch are as follows: > 1. Using upstream directly with automated script to generate kernel > code. This allows us to update the kernel every upstream release, so > the kernel gets the latest bug fixes and performance improvements, > and doesn't get 3 years out of date again. The automation and the > translated code are tested every upstream commit to ensure it > continues to work. > 2. Upgrades from a custom zstd based on 1.3.1 to 1.4.6, getting 3 years > of performance improvements and bug fixes. On x86_64 I've measured > 15% faster BtrFS and SquashFS decompression+read speeds, 35% faster > kernel decompression, and 30% faster ZRAM decompression+read speeds. > Additionally, the latest zstd uses ~1 KB less stack space for > compression. > 3. Switches to using the upstream API directly. It is slightly less > ergonomic for the kernel use case, where malloc/free aren't provided. > But, it means that users don't need to familiarize themselves with 2 > zstd APIs. > > I chose the bulk update instead of replaying upstream commits because > there have been ~3500 upstream commits since the 1.3.1 release, zstd > wasn't ready to be used in the kernel as-is before a month ago, and not > all upstream zstd commits build. The bulk update preserves bisectablity > because bugs can be bisected to the zstd version update. At that point > the update can be reverted, and we can work with upstream to find and > fix the bug. 
> > Note that upstream zstd release 1.4.6 doesn't exist yet. I have cut a > staging branch at c4763f087c2b [0] and will apply any changes requested > to the staging branch. Once we're ready to merge this update I will cut > a zstd release at the commit we merge, so we have a known zstd release > in the kernel. > > [0] > https://github.com/facebook/zstd/commit/c4763f087c2b4b5857a8323ff3360b240db23786 > > Signed-off-by: Nick Terrell Below is a diff that shows the difference between upstream zstd imported directly into the kernel, and the version in this patch that uses upstream's automation to generate a working zstd. I hope it is helpful for review, since I know the full patch is way too large for a meaningful review. The automation does several necessary things: * Rewrite libc headers * Replace bundled xxhash with kernel xxhash * Provide zstd_deps.h, which holds all of zstd’s libc dependencies It also hardwires certain preprocessor macros to avoid unnecessary portability code in the kernel. This is not strictly necessary, because these macros could be defined at compile time. See [1] for a list of macros. This diff is also available at [0]. 
[0] https://gist.github.com/terrelln/5a266ef4f6ee8bc60dde192daaaf2c97 [1] https://github.com/facebook/zstd/blob/d96e98cfde66e9e20dcadcfd9ed3b82ba648adfe/contrib/linux-kernel/Makefile#L17 Best, Nick --- include/linux/zstd.h | 28 +--- include/linux/zstd_errors.h | 24 +-- lib/zstd/common/bitstream.h | 28 +--- lib/zstd/common/compiler.h| 91 ++- lib/zstd/common/cpu.h | 21 +-- lib/zstd/common/debug.h | 6 - lib/zstd/common/entropy_common.c | 7 +- lib/zstd/common/error_private.h | 18 +-- lib/zstd/common/fse.h | 8 +- lib/zstd/common/fse_decompress.c | 13 -- lib/zstd/common/huf.h | 8 +- lib/zstd/common/mem.h | 77 +- lib/zstd/common/zstd_deps.h | 110 ++--- lib/zstd/common/zstd_internal.h | 35 + lib/zstd/compress/fse_compress.c | 80 -- lib/zstd/compress/hist.c | 16 -- lib/zstd/compress/huf_compress.c | 35 - lib/zstd/compress/zstd_compress.c | 135 +--- lib/zstd/compress/zstd_compress_internal.h| 42 + lib/zstd/compress/zstd_compress_superblock.h | 2 +- lib/zstd/compress/zstd_cwksp.h| 6 - lib/zstd/compress/zstd_double_fast.h | 6 - lib/zstd/compress/zstd_fast.h | 6 - lib/zstd/compress/zstd_lazy.c | 4 +- lib/zstd/compress/zstd_lazy.h | 6 - lib/zstd/com
Re: [PATCH 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API
> On Sep 15, 2020, at 11:31 PM, Chao Yu wrote: > > Hi Nick, > > remove not related mailing list. > > On 2020/9/16 11:43, Nick Terrell wrote: >> From: Nick Terrell >> Move away from the compatibility wrapper to the zstd-1.4.6 API. This >> code is more efficient because it uses the single-pass API instead of >> the streaming API. The streaming API is not necessary because the whole >> input and output buffers are available. This saves memory because we >> don't need to allocate a buffer for the window. It is also more >> efficient because it saves unnecessary memcpy calls. >> I've had problems testing this code because I see data truncation before >> and after this patchset. Help testing this patch would be much >> appreciated. > > Can you please explain more about data truncation? I'm a little confused... > > Do you mean that f2fs doesn't allocate enough memory for zstd compression, > so that compression is not finished actually, the compressed data is truncated > at dst buffer? Hi Chao, I’ve tested F2FS using a benchmark I adapted from testing BtrFS [0]. It is possible that the script I’m using is buggy or is exposing an edge case in F2FS. The files that I copy to F2FS and compress end up truncated with a hole at the end. It is based off of upstream commit ab29a807a7. E.g. the end of the copied file looks like this, but the original file has non-zero data at the end. Up to the hole at the end, the file is correct. 
od dickens | tail -n 5 > 46667760 067502 066167 020056 040440 020163 023511 006555 060412 > 4667 00 00 00 00 00 00 00 00 > * > 46703060 00 00 00 00 00 00 00 > 46703076 [0] https://gist.github.com/terrelln/7dd2919937dfbdb8e839e4ad11c81db4 Best, Nick > Thanks, > >> Signed-off-by: Nick Terrell >> --- >> fs/f2fs/compress.c | 102 + >> 1 file changed, 38 insertions(+), 64 deletions(-) >> diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c >> index e056f3a2b404..b79efce81651 100644 >> --- a/fs/f2fs/compress.c >> +++ b/fs/f2fs/compress.c >> @@ -11,7 +11,8 @@ >> #include >> #include >> #include >> -#include >> +#include >> +#include >>#include "f2fs.h" >> #include "node.h" >> @@ -298,21 +299,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = { >> static int zstd_init_compress_ctx(struct compress_ctx *cc) >> { >> ZSTD_parameters params; >> -ZSTD_CStream *stream; >> +ZSTD_CCtx *ctx; >> void *workspace; >> unsigned int workspace_size; >> params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0); >> -workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams); >> +workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams); >> workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode), >> workspace_size, GFP_NOFS); >> if (!workspace) >> return -ENOMEM; >> - stream = ZSTD_initCStream(params, 0, workspace, workspace_size); >> -if (!stream) { >> -printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initCStream >> failed\n", >> +ctx = ZSTD_initStaticCCtx(workspace, workspace_size); >> +if (!ctx) { >> +printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_inittaticCStream >> failed\n", >> KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id, >> __func__); >> kvfree(workspace); >> @@ -320,7 +321,7 @@ static int zstd_init_compress_ctx(struct compress_ctx >> *cc) >> } >> cc->private = workspace; >> -cc->private2 = stream; >> +cc->private2 = ctx; >> cc->clen = cc->rlen - PAGE_SIZE - COMPRESS_HEADER_SIZE; >> return 0; >> @@ -335,65 +336,48 @@ static void zstd_destroy_compress_ctx(struct >> compress_ctx *cc) 
>>static int zstd_compress_pages(struct compress_ctx *cc) >> { >> -ZSTD_CStream *stream = cc->private2; >> -ZSTD_inBuffer inbuf; >> -ZSTD_outBuffer outbuf; >> -int src_size = cc->rlen; >> -int dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE; >> -int ret; >> - >> -inbuf.pos = 0; >> -inbuf.src = cc->rbuf; >> -inbuf.size = src_size; >> - >> -outbuf.pos = 0; >> -outbuf.dst = cc->cbuf->cdata; >> -outbuf.size = dst_size; >> - >> -ret = ZSTD_compressStream(stream, &outbuf, &inbuf); >> -
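The truncation symptom described in the reply above (the copy matches the original up to some offset, then reads as zeros to the end) can be checked mechanically. The following is a small sketch with hypothetical helper names (first_mismatch, trailing_zero_start); it is not taken from the benchmark script at [0] and assumes both files have already been read into memory at the same length.

```c
#include <stddef.h>

/*
 * Offset of the first byte where the two buffers differ,
 * or len if they are identical.
 */
static size_t first_mismatch(const unsigned char *a,
			     const unsigned char *b, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (a[i] != b[i])
			return i;
	}
	return len;
}

/*
 * Offset where the trailing run of zero bytes begins,
 * or len if the buffer does not end in zeros.
 */
static size_t trailing_zero_start(const unsigned char *buf, size_t len)
{
	size_t end = len;

	while (end > 0 && buf[end - 1] == 0)
		end--;
	return end;
}
```

If first_mismatch() on (original, copy) returns the same offset as trailing_zero_start() on the copy, the copy is correct up to a zero-filled hole at the end, which is the pattern in the od output.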
[PATCH 4/9] crypto: zstd: Switch to zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is functionally equivalent. Signed-off-by: Nick Terrell --- crypto/zstd.c | 24 +++- 1 file changed, 11 insertions(+), 13 deletions(-) diff --git a/crypto/zstd.c b/crypto/zstd.c index dcda3cad3b5c..767fe2fbe009 100644 --- a/crypto/zstd.c +++ b/crypto/zstd.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include @@ -24,16 +24,15 @@ struct zstd_ctx { void *dwksp; }; -static ZSTD_parameters zstd_params(void) -{ - return ZSTD_getParams(ZSTD_DEF_LEVEL, 0, 0); -} - static int zstd_comp_init(struct zstd_ctx *ctx) { int ret = 0; - const ZSTD_parameters params = zstd_params(); - const size_t wksp_size = ZSTD_CCtxWorkspaceBound(params.cParams); + const size_t wksp_size = ZSTD_estimateCCtxSize(ZSTD_DEF_LEVEL); + + if (ZSTD_isError(wksp_size)) { + ret = -EINVAL; + goto out_free; + } ctx->cwksp = vzalloc(wksp_size); if (!ctx->cwksp) { @@ -41,7 +40,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx) goto out; } - ctx->cctx = ZSTD_initCCtx(ctx->cwksp, wksp_size); + ctx->cctx = ZSTD_initStaticCCtx(ctx->cwksp, wksp_size); if (!ctx->cctx) { ret = -EINVAL; goto out_free; @@ -56,7 +55,7 @@ static int zstd_comp_init(struct zstd_ctx *ctx) static int zstd_decomp_init(struct zstd_ctx *ctx) { int ret = 0; - const size_t wksp_size = ZSTD_DCtxWorkspaceBound(); + const size_t wksp_size = ZSTD_estimateDCtxSize(); ctx->dwksp = vzalloc(wksp_size); if (!ctx->dwksp) { @@ -64,7 +63,7 @@ static int zstd_decomp_init(struct zstd_ctx *ctx) goto out; } - ctx->dctx = ZSTD_initDCtx(ctx->dwksp, wksp_size); + ctx->dctx = ZSTD_initStaticDCtx(ctx->dwksp, wksp_size); if (!ctx->dctx) { ret = -EINVAL; goto out_free; @@ -152,9 +151,8 @@ static int __zstd_compress(const u8 *src, unsigned int slen, { size_t out_len; struct zstd_ctx *zctx = ctx; - const ZSTD_parameters params = zstd_params(); - out_len = ZSTD_compressCCtx(zctx->cctx, dst, *dlen, src, slen, params); + out_len = ZSTD_compressCCtx(zctx->cctx, 
dst, *dlen, src, slen, ZSTD_DEF_LEVEL); if (ZSTD_isError(out_len)) return -EINVAL; *dlen = out_len; -- 2.28.0
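The callers in this series test a size_t return value with ZSTD_isError() and then map it to a negative errno. As a rough model of why that works: zstd-style functions return either a byte count or an error code encoded as a very large size_t. The sketch below uses hypothetical toy_* names and illustrative error numbers (treat the exact values as assumptions); it is not the zstd implementation, and the -EAGAIN / -EIO mapping simply mirrors what the f2fs patch in this series does.

```c
#include <stddef.h>
#include <errno.h>

/* Illustrative error numbers; the exact values are assumptions. */
enum toy_error {
	TOY_ERROR_NONE = 0,
	TOY_ERROR_GENERIC = 1,
	TOY_ERROR_DST_TOO_SMALL = 70,
	TOY_ERROR_MAX_CODE = 120
};

/* Encode an error as a size_t close to SIZE_MAX (the negated code). */
static size_t toy_make_error(enum toy_error e)
{
	return (size_t)0 - (size_t)e;
}

/* Any value above the "max code" threshold is an error, not a length. */
static int toy_is_error(size_t code)
{
	return code > toy_make_error(TOY_ERROR_MAX_CODE);
}

/* Recover the error number from an encoded return value. */
static enum toy_error toy_get_error_code(size_t code)
{
	if (!toy_is_error(code))
		return TOY_ERROR_NONE;
	return (enum toy_error)((size_t)0 - code);
}

/* Map a result to 0 or a negative errno, as a filesystem caller might. */
static int toy_result_to_errno(size_t code)
{
	if (!toy_is_error(code))
		return 0;
	if (toy_get_error_code(code) == TOY_ERROR_DST_TOO_SMALL)
		return -EAGAIN; /* output did not fit; caller may fall back */
	return -EIO;
}
```

Because valid lengths and error codes share one return type, a caller that skips the error check would treat an error as an enormous length, which is why every converted call site keeps the explicit check.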
[PATCH 7/9] squashfs: zstd: Switch to the zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is functionally equivalent. Signed-off-by: Nick Terrell --- fs/squashfs/zstd_wrapper.c | 9 - 1 file changed, 4 insertions(+), 5 deletions(-) diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c index f8c512a6204e..add582409866 100644 --- a/fs/squashfs/zstd_wrapper.c +++ b/fs/squashfs/zstd_wrapper.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include #include "squashfs_fs.h" @@ -34,7 +34,7 @@ static void *zstd_init(struct squashfs_sb_info *msblk, void *buff) goto failed; wksp->window_size = max_t(size_t, msblk->block_size, SQUASHFS_METADATA_SIZE); - wksp->mem_size = ZSTD_DStreamWorkspaceBound(wksp->window_size); + wksp->mem_size = ZSTD_estimateDStreamSize(wksp->window_size); wksp->mem = vmalloc(wksp->mem_size); if (wksp->mem == NULL) goto failed; @@ -71,7 +71,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm, struct bvec_iter_all iter_all = {}; struct bio_vec *bvec = bvec_init_iter_all(&iter_all); - stream = ZSTD_initDStream(wksp->window_size, wksp->mem, wksp->mem_size); + stream = ZSTD_initStaticDStream(wksp->mem, wksp->mem_size); if (!stream) { ERROR("Failed to initialize zstd decompressor\n"); @@ -122,8 +122,7 @@ static int zstd_uncompress(struct squashfs_sb_info *msblk, void *strm, break; if (ZSTD_isError(zstd_err)) { - ERROR("zstd decompression error: %d\n", - (int)ZSTD_getErrorCode(zstd_err)); + ERROR("zstd decompression error: %s\n", ZSTD_getErrorName(zstd_err)); error = -EIO; break; } -- 2.28.0
[PATCH 9/9] lib: zstd: Remove zstd compatibility wrapper
From: Nick Terrell All callers have been transitioned to the new zstd-1.4.6 API. There are no more callers of the zstd compatibility wrapper, so delete it. Signed-off-by: Nick Terrell --- include/linux/zstd_compat.h | 112 1 file changed, 112 deletions(-) delete mode 100644 include/linux/zstd_compat.h diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h deleted file mode 100644 index 11acf14d9d70.. --- a/include/linux/zstd_compat.h +++ /dev/null @@ -1,112 +0,0 @@ -/* - * Copyright (c) 2016-present, Facebook, Inc. - * All rights reserved. - * - * This source code is licensed under the BSD-style license found in the - * LICENSE file in the root directory of https://github.com/facebook/zstd. - * An additional grant of patent rights can be found in the PATENTS file in the - * same directory. - * - * This program is free software; you can redistribute it and/or modify it under - * the terms of the GNU General Public License version 2 as published by the - * Free Software Foundation. This program is dual-licensed; you may select - * either version 2 of the GNU General Public License ("GPL") or BSD license - * ("BSD"). - */ - -#ifndef ZSTD_COMPAT_H -#define ZSTD_COMPAT_H - -#include - -#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406) -/* - * This header provides backwards compatibility for the zstd-1.4.6 library - * upgrade. This header allows us to upgrade the zstd library version without - * modifying any callers. Then we will migrate callers from the compatibility - * wrapper one at a time until none remain. At which point we will delete this - * header. - * - * It is temporary and will be deleted once the upgrade is complete. 
- */ - -#include - -static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params) -{ -return ZSTD_estimateCCtxSize_usingCParams(compression_params); -} - -static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params) -{ -return ZSTD_estimateCStreamSize_usingCParams(compression_params); -} - -static inline size_t ZSTD_DCtxWorkspaceBound(void) -{ -return ZSTD_estimateDCtxSize(); -} - -static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size) -{ -return ZSTD_estimateDStreamSize(window_size); -} - -static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size) -{ -if (wksp == NULL) -return NULL; -return ZSTD_initStaticCCtx(wksp, wksp_size); -} - -static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, size_t pledged_src_size, void* wksp, size_t wksp_size) -{ -ZSTD_CStream* cstream; -size_t ret; - -if (wksp == NULL) -return NULL; - -cstream = ZSTD_initStaticCStream(wksp, wksp_size); -if (cstream == NULL) -return NULL; - -ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size); -if (ZSTD_isError(ret)) -return NULL; - -return cstream; -} -#define ZSTD_initCStream ZSTD_initCStream_compat - -static inline ZSTD_DCtx* ZSTD_initDCtx(void* wksp, size_t wksp_size) -{ -if (wksp == NULL) -return NULL; -return ZSTD_initStaticDCtx(wksp, wksp_size); -} - -static inline ZSTD_DStream* ZSTD_initDStream_compat(unsigned long long window_size, void* wksp, size_t wksp_size) -{ -if (wksp == NULL) -return NULL; -(void)window_size; -return ZSTD_initStaticDStream(wksp, wksp_size); -} -#define ZSTD_initDStream ZSTD_initDStream_compat - -typedef ZSTD_frameHeader ZSTD_frameParams; - -static inline size_t ZSTD_getFrameParams(ZSTD_frameParams* frame_params, const void* src, size_t src_size) -{ -return ZSTD_getFrameHeader(frame_params, src, src_size); -} - -static inline size_t ZSTD_compressCCtx_compat(ZSTD_CCtx* cctx, void* dst, size_t dst_capacity, const void* src, 
size_t src_size, ZSTD_parameters params) -{ -return ZSTD_compress_advanced(cctx, dst, dst_capacity, src, src_size, NULL, 0, params); -} -#define ZSTD_compressCCtx ZSTD_compressCCtx_compat - -#endif /* ZSTD_VERSION_NUMBER >= 10406 */ -#endif /* ZSTD_COMPAT_H */ -- 2.28.0
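The compatibility header is gated on ZSTD_VERSION_NUMBER >= 10406. Upstream zstd.h composes that number as major * 100 * 100 + minor * 100 + release, so 1.4.6 becomes 10406. The sketch below reproduces the arithmetic with hypothetical TOY_* macros (they are not part of zstd or of this patch) to show which versions would enable the wrapper.

```c
/* Hypothetical helper mirroring how the version number is composed. */
#define TOY_VERSION_NUMBER(major, minor, release) \
	((major) * 100 * 100 + (minor) * 100 + (release))

/* Same threshold as the gate in zstd_compat.h: 1.4.6 and newer. */
#define TOY_COMPAT_ENABLED(version) ((version) >= 10406)

/* The kernel's pre-upgrade zstd was based on 1.3.1. */
enum {
	TOY_OLD_KERNEL_ZSTD = TOY_VERSION_NUMBER(1, 3, 1),
	TOY_NEW_KERNEL_ZSTD = TOY_VERSION_NUMBER(1, 4, 6)
};
```

With this encoding the gate is false for the 1.3.1-based code and true from 1.4.6 onward, which matches the patch description: the header is inert until the library upgrade lands.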
[PATCH 8/9] lib: unzstd: Switch to the zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is functionally equivalent. Signed-off-by: Nick Terrell --- lib/decompress_unzstd.c | 40 ++-- 1 file changed, 14 insertions(+), 26 deletions(-) diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c index a79f705f236d..d4685df0e120 100644 --- a/lib/decompress_unzstd.c +++ b/lib/decompress_unzstd.c @@ -73,7 +73,8 @@ #include #include -#include +#include +#include /* 128MB is the maximum window size supported by zstd. */ #define ZSTD_WINDOWSIZE_MAX(1 << ZSTD_WINDOWLOG_MAX) @@ -120,9 +121,9 @@ static int INIT decompress_single(const u8 *in_buf, long in_len, u8 *out_buf, long out_len, long *in_pos, void (*error)(char *x)) { - const size_t wksp_size = ZSTD_DCtxWorkspaceBound(); + const size_t wksp_size = ZSTD_estimateDCtxSize(); void *wksp = large_malloc(wksp_size); - ZSTD_DCtx *dctx = ZSTD_initDCtx(wksp, wksp_size); + ZSTD_DCtx *dctx = ZSTD_initStaticDCtx(wksp, wksp_size); int err; size_t ret; @@ -165,7 +166,6 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len, { ZSTD_inBuffer in; ZSTD_outBuffer out; - ZSTD_frameParams params; void *in_allocated = NULL; void *out_allocated = NULL; void *wksp = NULL; @@ -229,36 +229,24 @@ static int INIT __unzstd(unsigned char *in_buf, long in_len, out.size = out_len; /* -* We need to know the window size to allocate the ZSTD_DStream. -* Since we are streaming, we need to allocate a buffer for the sliding -* window. The window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX -* (8 MB), so it is important to use the actual value so as not to -* waste memory when it is smaller. +* Zstd determines the workspace size from the window size written +* into the frame header. This ensures that we use the minimum value +* possible, since the window size varies from 1 KB to ZSTD_WINDOWSIZE_MAX +* (1 GB), so it is very important to use the actual value. 
 */ - ret = ZSTD_getFrameParams(&params, in.src, in.size); + wksp_size = ZSTD_estimateDStreamSize_fromFrame(in.src, in.size); err = handle_zstd_error(ret, error); if (err) goto out; - if (ret != 0) { - error("ZSTD-compressed data has an incomplete frame header"); - err = -1; - goto out; - } - if (params.windowSize > ZSTD_WINDOWSIZE_MAX) { - error("ZSTD-compressed data has too large a window size"); + wksp = large_malloc(wksp_size); + if (wksp == NULL) { + error("Out of memory while allocating ZSTD_DStream"); err = -1; goto out; } - - /* -* Allocate the ZSTD_DStream now that we know how much memory is -* required. -*/ - wksp_size = ZSTD_DStreamWorkspaceBound(params.windowSize); - wksp = large_malloc(wksp_size); - dstream = ZSTD_initDStream(params.windowSize, wksp, wksp_size); + dstream = ZSTD_initStaticDStream(wksp, wksp_size); if (dstream == NULL) { - error("Out of memory while allocating ZSTD_DStream"); + error("ZSTD_initStaticDStream failed"); err = -1; goto out; } -- 2.28.0
[PATCH 2/9] lib: zstd: Add decompress_sources.h for decompress_unzstd
From: Nick Terrell Adds decompress_sources.h which includes every .c file necessary for zstd decompression. This is used in decompress_unzstd.c so the internal structure of the library isn't exposed. This allows us to upgrade the zstd library version without modifying any callers. Instead we just need to update decompress_sources.h. Signed-off-by: Nick Terrell --- lib/decompress_unzstd.c | 6 +- lib/zstd/decompress_sources.h | 14 ++ 2 files changed, 15 insertions(+), 5 deletions(-) create mode 100644 lib/zstd/decompress_sources.h diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c index dbc290af26b4..a79f705f236d 100644 --- a/lib/decompress_unzstd.c +++ b/lib/decompress_unzstd.c @@ -68,11 +68,7 @@ #ifdef STATIC # define UNZSTD_PREBOOT # include "xxhash.c" -# include "zstd/entropy_common.c" -# include "zstd/fse_decompress.c" -# include "zstd/huf_decompress.c" -# include "zstd/zstd_common.c" -# include "zstd/decompress.c" +# include "zstd/decompress_sources.h" #endif #include diff --git a/lib/zstd/decompress_sources.h b/lib/zstd/decompress_sources.h new file mode 100644 index ..ccb4960ea0cd --- /dev/null +++ b/lib/zstd/decompress_sources.h @@ -0,0 +1,14 @@ +// SPDX-License-Identifier: GPL-2.0-only + +/* + * This file includes every .c file needed for decompression. + * It is used by lib/decompress_unzstd.c to include the decompression + * source into the translation-unit, so it can be used for kernel + * decompression. + */ + +#include "entropy_common.c" +#include "fse_decompress.c" +#include "huf_decompress.c" +#include "zstd_common.c" +#include "decompress.c" -- 2.28.0
[PATCH 6/9] f2fs: zstd: Switch to the zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is more efficient because it uses the single-pass API instead of the streaming API. The streaming API is not necessary because the whole input and output buffers are available. This saves memory because we don't need to allocate a buffer for the window. It is also more efficient because it saves unnecessary memcpy calls. I've had problems testing this code because I see data truncation before and after this patchset. Help testing this patch would be much appreciated. Signed-off-by: Nick Terrell --- fs/f2fs/compress.c | 102 + 1 file changed, 38 insertions(+), 64 deletions(-) diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index e056f3a2b404..b79efce81651 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -11,7 +11,8 @@ #include #include #include -#include +#include +#include #include "f2fs.h" #include "node.h" @@ -298,21 +299,21 @@ static const struct f2fs_compress_ops f2fs_lz4_ops = { static int zstd_init_compress_ctx(struct compress_ctx *cc) { ZSTD_parameters params; - ZSTD_CStream *stream; + ZSTD_CCtx *ctx; void *workspace; unsigned int workspace_size; params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, cc->rlen, 0); - workspace_size = ZSTD_CStreamWorkspaceBound(params.cParams); + workspace_size = ZSTD_estimateCCtxSize_usingCParams(params.cParams); workspace = f2fs_kvmalloc(F2FS_I_SB(cc->inode), workspace_size, GFP_NOFS); if (!workspace) return -ENOMEM; - stream = ZSTD_initCStream(params, 0, workspace, workspace_size); - if (!stream) { - printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_initCStream failed\n", + ctx = ZSTD_initStaticCCtx(workspace, workspace_size); + if (!ctx) { + printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_inittaticCStream failed\n", KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id, __func__); kvfree(workspace); @@ -320,7 +321,7 @@ static int zstd_init_compress_ctx(struct compress_ctx *cc) } cc->private = workspace; - cc->private2 = stream; + cc->private2 = 
ctx; cc->clen = cc->rlen - PAGE_SIZE - COMPRESS_HEADER_SIZE; return 0; @@ -335,65 +336,48 @@ static void zstd_destroy_compress_ctx(struct compress_ctx *cc) static int zstd_compress_pages(struct compress_ctx *cc) { - ZSTD_CStream *stream = cc->private2; - ZSTD_inBuffer inbuf; - ZSTD_outBuffer outbuf; - int src_size = cc->rlen; - int dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE; - int ret; - - inbuf.pos = 0; - inbuf.src = cc->rbuf; - inbuf.size = src_size; - - outbuf.pos = 0; - outbuf.dst = cc->cbuf->cdata; - outbuf.size = dst_size; - - ret = ZSTD_compressStream(stream, &outbuf, &inbuf); - if (ZSTD_isError(ret)) { - printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compressStream failed, ret: %d\n", - KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id, - __func__, ZSTD_getErrorCode(ret)); - return -EIO; - } - - ret = ZSTD_endStream(stream, &outbuf); + ZSTD_CCtx *ctx = cc->private2; + const size_t src_size = cc->rlen; + const size_t dst_size = src_size - PAGE_SIZE - COMPRESS_HEADER_SIZE; + ZSTD_parameters params = ZSTD_getParams(F2FS_ZSTD_DEFAULT_CLEVEL, src_size, 0); + size_t ret; + + ret = ZSTD_compress_advanced( + ctx, cc->cbuf->cdata, dst_size, cc->rbuf, src_size, NULL, 0, params); if (ZSTD_isError(ret)) { - printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_endStream returned %d\n", + /* +* there is compressed data remained in intermediate buffer due to +* no more space in cbuf.cdata +*/ + if (ZSTD_getErrorCode(ret) == ZSTD_error_dstSize_tooSmall) + return -EAGAIN; + /* other compression errors return -EIO */ + printk_ratelimited("%sF2FS-fs (%s): %s ZSTD_compress_advanced failed, err: %s\n", KERN_ERR, F2FS_I_SB(cc->inode)->sb->s_id, - __func__, ZSTD_getErrorCode(ret)); + __func__, ZSTD_getErrorName(ret)); return -EIO; } - /* -* there is compressed data remained in intermediate buffer due to -* no more space in cbuf.cdata -*/ - if (ret) - return -EAGAIN; - - cc->clen = outbuf.
[PATCH 5/9] btrfs: zstd: Switch to the zstd-1.4.6 API
From: Nick Terrell Move away from the compatibility wrapper to the zstd-1.4.6 API. This code is functionally equivalent. Signed-off-by: Nick Terrell --- fs/btrfs/zstd.c | 48 1 file changed, 28 insertions(+), 20 deletions(-) diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c index a7367ff573d4..6b466e090cd7 100644 --- a/fs/btrfs/zstd.c +++ b/fs/btrfs/zstd.c @@ -16,7 +16,7 @@ #include #include #include -#include +#include #include "misc.h" #include "compression.h" #include "ctree.h" @@ -159,8 +159,8 @@ static void zstd_calc_ws_mem_sizes(void) zstd_get_btrfs_parameters(level, ZSTD_BTRFS_MAX_INPUT); size_t level_size = max_t(size_t, - ZSTD_CStreamWorkspaceBound(params.cParams), - ZSTD_DStreamWorkspaceBound(ZSTD_BTRFS_MAX_INPUT)); + ZSTD_estimateCStreamSize_usingCParams(params.cParams), + ZSTD_estimateDStreamSize(ZSTD_BTRFS_MAX_INPUT)); max_size = max_t(size_t, max_size, level_size); zstd_ws_mem_sizes[level - 1] = max_size; @@ -389,13 +389,23 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping, *total_in = 0; /* Initialize the stream */ - stream = ZSTD_initCStream(params, len, workspace->mem, - workspace->size); + stream = ZSTD_initStaticCStream(workspace->mem, workspace->size); if (!stream) { - pr_warn("BTRFS: ZSTD_initCStream failed\n"); + pr_warn("BTRFS: ZSTD_initStaticCStream failed\n"); ret = -EIO; goto out; } + { + size_t ret2; + + ret2 = ZSTD_initCStream_advanced(stream, NULL, 0, params, len); + if (ZSTD_isError(ret2)) { + pr_warn("BTRFS: ZSTD_initCStream_advanced returned %s\n", + ZSTD_getErrorName(ret2)); + ret = -EIO; + goto out; + } + } /* map in the first page of input data */ in_page = find_get_page(mapping, start >> PAGE_SHIFT); @@ -421,8 +431,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping, ret2 = ZSTD_compressStream(stream, >out_buf, >in_buf); if (ZSTD_isError(ret2)) { - pr_debug("BTRFS: ZSTD_compressStream returned %d\n", - ZSTD_getErrorCode(ret2)); + pr_debug("BTRFS: ZSTD_compressStream returned 
%s\n", + ZSTD_getErrorName(ret2)); ret = -EIO; goto out; } @@ -489,8 +499,8 @@ int zstd_compress_pages(struct list_head *ws, struct address_space *mapping, ret2 = ZSTD_endStream(stream, >out_buf); if (ZSTD_isError(ret2)) { - pr_debug("BTRFS: ZSTD_endStream returned %d\n", - ZSTD_getErrorCode(ret2)); + pr_debug("BTRFS: ZSTD_endStream returned %s\n", + ZSTD_getErrorName(ret2)); ret = -EIO; goto out; } @@ -557,10 +567,9 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb) unsigned long buf_start; unsigned long total_out = 0; - stream = ZSTD_initDStream( - ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size); + stream = ZSTD_initStaticDStream(workspace->mem, workspace->size); if (!stream) { - pr_debug("BTRFS: ZSTD_initDStream failed\n"); + pr_debug("BTRFS: ZSTD_initStaticDStream failed\n"); ret = -EIO; goto done; } @@ -579,8 +588,8 @@ int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb) ret2 = ZSTD_decompressStream(stream, >out_buf, >in_buf); if (ZSTD_isError(ret2)) { - pr_debug("BTRFS: ZSTD_decompressStream returned %d\n", - ZSTD_getErrorCode(ret2)); + pr_debug("BTRFS: ZSTD_decompressStream returned %s\n", + ZSTD_getErrorName(ret2)); ret = -EIO; goto done; } @@ -633,10 +642,9 @@ int zstd_decompress(struct list_head *ws, unsigned char *data_in, unsigned long pg_offset = 0; char *kaddr; -
[PATCH 1/9] lib: zstd: Add zstd compatibility wrapper
From: Nick Terrell Adds zstd_compat.h which provides the necessary functions from the current zstd.h API. It is only active for zstd versions 1.4.6 and newer. That means it is disabled currently, but will become active when a later patch in this series updates the zstd library in the kernel to 1.4.6. This header allows the zstd upgrade to 1.4.6 without changing any callers, since they all include zstd through the compatibility wrapper. Later patches in this series transition each caller away from the compatibility wrapper. After all the callers have been transitioned away from the compatibility wrapper, the final patch in this series deletes it. Signed-off-by: Nick Terrell --- crypto/zstd.c | 2 +- fs/btrfs/zstd.c | 2 +- fs/f2fs/compress.c | 2 +- fs/squashfs/zstd_wrapper.c | 2 +- include/linux/zstd_compat.h | 112 lib/decompress_unzstd.c | 2 +- 6 files changed, 117 insertions(+), 5 deletions(-) create mode 100644 include/linux/zstd_compat.h diff --git a/crypto/zstd.c b/crypto/zstd.c index 1a3309f066f7..dcda3cad3b5c 100644 --- a/crypto/zstd.c +++ b/crypto/zstd.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c index 9a4871636c6c..a7367ff573d4 100644 --- a/fs/btrfs/zstd.c +++ b/fs/btrfs/zstd.c @@ -16,7 +16,7 @@ #include #include #include -#include +#include #include "misc.h" #include "compression.h" #include "ctree.h" diff --git a/fs/f2fs/compress.c b/fs/f2fs/compress.c index 1dfb126a0cb2..e056f3a2b404 100644 --- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include "f2fs.h" #include "node.h" diff --git a/fs/squashfs/zstd_wrapper.c b/fs/squashfs/zstd_wrapper.c index b7cb1faa652d..f8c512a6204e 100644 --- a/fs/squashfs/zstd_wrapper.c +++ b/fs/squashfs/zstd_wrapper.c @@ -11,7 +11,7 @@ #include #include #include -#include +#include #include #include "squashfs_fs.h" diff --git a/include/linux/zstd_compat.h b/include/linux/zstd_compat.h 
new file mode 100644 index ..11acf14d9d70 --- /dev/null +++ b/include/linux/zstd_compat.h @@ -0,0 +1,112 @@ +/* + * Copyright (c) 2016-present, Facebook, Inc. + * All rights reserved. + * + * This source code is licensed under the BSD-style license found in the + * LICENSE file in the root directory of https://github.com/facebook/zstd. + * An additional grant of patent rights can be found in the PATENTS file in the + * same directory. + * + * This program is free software; you can redistribute it and/or modify it under + * the terms of the GNU General Public License version 2 as published by the + * Free Software Foundation. This program is dual-licensed; you may select + * either version 2 of the GNU General Public License ("GPL") or BSD license + * ("BSD"). + */ + +#ifndef ZSTD_COMPAT_H +#define ZSTD_COMPAT_H + +#include + +#if defined(ZSTD_VERSION_NUMBER) && (ZSTD_VERSION_NUMBER >= 10406) +/* + * This header provides backwards compatibility for the zstd-1.4.6 library + * upgrade. This header allows us to upgrade the zstd library version without + * modifying any callers. Then we will migrate callers from the compatibility + * wrapper one at a time until none remain. At which point we will delete this + * header. + * + * It is temporary and will be deleted once the upgrade is complete. 
+ */ + +#include + +static inline size_t ZSTD_CCtxWorkspaceBound(ZSTD_compressionParameters compression_params) +{ +return ZSTD_estimateCCtxSize_usingCParams(compression_params); +} + +static inline size_t ZSTD_CStreamWorkspaceBound(ZSTD_compressionParameters compression_params) +{ +return ZSTD_estimateCStreamSize_usingCParams(compression_params); +} + +static inline size_t ZSTD_DCtxWorkspaceBound(void) +{ +return ZSTD_estimateDCtxSize(); +} + +static inline size_t ZSTD_DStreamWorkspaceBound(unsigned long long window_size) +{ +return ZSTD_estimateDStreamSize(window_size); +} + +static inline ZSTD_CCtx* ZSTD_initCCtx(void* wksp, size_t wksp_size) +{ +if (wksp == NULL) +return NULL; +return ZSTD_initStaticCCtx(wksp, wksp_size); +} + +static inline ZSTD_CStream* ZSTD_initCStream_compat(ZSTD_parameters params, size_t pledged_src_size, void* wksp, size_t wksp_size) +{ +ZSTD_CStream* cstream; +size_t ret; + +if (wksp == NULL) +return NULL; + +cstream = ZSTD_initStaticCStream(wksp, wksp_size); +if (cstream == NULL) +return NULL; + +ret = ZSTD_initCStream_advanced(cstream, NULL, 0, params, pledged_src_size); +if (ZSTD_isError(ret)) +return NULL; + +return cstream; +} +#define ZSTD_initCStream ZSTD_initCStream_compat + +static inline ZSTD_DCtx* ZSTD_initDCtx(void* wksp, size_t wksp_size) +{ +if (wksp == NULL) +retu
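The version gate in the header above compares ZSTD_VERSION_NUMBER against 10406. As a minimal sketch, assuming the upstream encoding of MAJOR * 100 * 100 + MINOR * 100 + RELEASE, the helpers below show why 10406 corresponds to zstd-1.4.6. The demo_ names are hypothetical and are not part of zstd or of this patch.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, for illustration only: mirrors how upstream zstd.h
 * derives ZSTD_VERSION_NUMBER from its major, minor and release numbers.
 */
static unsigned demo_zstd_version_number(unsigned major, unsigned minor,
					 unsigned release)
{
	return major * 100 * 100 + minor * 100 + release;
}

/* Same numeric comparison as the #if in zstd_compat.h. */
static bool demo_compat_wrapper_active(unsigned version_number)
{
	return version_number >= 10406;
}
```

With this encoding a 1.3.1-based library stays below the threshold, so the wrapper body is compiled out until the library itself is upgraded. The real header additionally checks that ZSTD_VERSION_NUMBER is defined at all.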
[PATCH 0/9] Update to zstd-1.4.6
From: Nick Terrell

This patchset upgrades the zstd library to the latest upstream release. The current zstd version in the kernel is a modified version of upstream zstd-1.3.1. At the time it was integrated, zstd wasn't ready to be used in the kernel as-is. But, it is now possible to use upstream zstd directly in the kernel.

I have not yet released zstd-1.4.6 upstream. I want the zstd version in the kernel to match up with a known upstream release, so we know exactly what code is running. Whenever this patchset is ready for merge, I will cut a release at the upstream commit that gets merged. This should not be necessary for future releases.

The kernel zstd library is automatically generated from upstream zstd. A script makes the necessary changes and imports it into the kernel. The changes are:

1. Replace all libc dependencies with kernel replacements and rewrite includes.
2. Remove unnecessary portability macros like: #if defined(_MSC_VER).
3. Use the kernel xxhash instead of bundling it.

This automation gets tested on every commit by upstream's continuous integration. When we cut a new zstd release, we will submit a patch to the kernel to update the zstd version in the kernel.

I've updated zstd to upstream with one big patch because every commit must build, so that precludes partial updates. Since the commit is 100% generated, I hope the review burden is lightened. I considered replaying upstream commits, but that is not possible because there have been ~3500 upstream commits since the last zstd import, and the commits don't all build individually. The bulk update preserves bisectability because bugs can be bisected to the zstd version update. At that point the update can be reverted, and we can work with upstream to find and fix the bug. After this big switch in how the kernel consumes zstd, future patches will be smaller, because they will only have one upstream release's worth of changes each.

This patchset comes in 3 parts:

1. The first 2 patches prepare for the zstd upgrade. The first patch adds a compatibility wrapper so zstd can be upgraded without modifying any callers. The second patch adds an indirection for how lib/decompress_unzstd.c includes all of the decompression source files.
2. Import zstd-1.4.6. This patch is completely generated from upstream using automated tooling.
3. Update all callers to the zstd-1.4.6 API then delete the compatibility wrapper.

I tested every caller of zstd on x86_64. I tested both after the 1.4.6 upgrade using the compatibility wrapper, and after the final patch in this series. I had problems with F2FS, where I had file truncation both before and after this series, so I would appreciate help testing it. All other callers were good. I tested kernel and initramfs decompression in i386 and arm.

I ran benchmarks to compare the current zstd in the kernel with zstd-1.4.6. I benchmarked on x86_64 using QEMU with KVM enabled on an Intel i9-9900k. I found:

* BtrFS zstd compression at levels 1 and 3 is 5% faster
* BtrFS zstd decompression+read is 15% faster
* SquashFS zstd decompression+read is 15% faster
* ZRAM decompression+read is 30% faster
* Kernel zstd decompression is 35% faster
* Initramfs zstd decompression+build is 5% faster

The latest zstd also offers bug fixes and a 1 KB reduction in stack usage during compression.

Please let me know if there is anything that I can do to ease the way for these patches. I think it is important because it brings large performance improvements, contains bug fixes, and switches to a more maintainable model of consuming upstream zstd directly, making it easy to keep up to date.
Best, Nick Terrell Nick Terrell (9): lib: zstd: Add zstd compatibility wrapper lib: zstd: Add decompress_sources.h for decompress_unzstd lib: zstd: Upgrade to latest upstream zstd version 1.4.6 crypto: zstd: Switch to zstd-1.4.6 API btrfs: zstd: Switch to the zstd-1.4.6 API f2fs: zstd: Switch to the zstd-1.4.6 API squashfs: zstd: Switch to the zstd-1.4.6 API lib: unzstd: Switch to the zstd-1.4.6 API lib: zstd: Remove zstd compatibility wrapper crypto/zstd.c | 22 +- fs/btrfs/zstd.c | 46 +- fs/f2fs/compress.c| 100 +- fs/squashfs/zstd_wrapper.c|7 +- include/linux/zstd.h | 3019 include/linux/zstd_errors.h | 76 + lib/decompress_unzstd.c | 44 +- lib/zstd/Makefile | 35 +- lib/zstd/bitstream.h | 379 -- lib/zstd/common/bitstream.h | 437 ++ lib/zstd/common/compiler.h| 134 + lib/zstd/common/cpu.h | 194 + lib/zstd/common/debug.c | 24 + lib/zstd/common/debug.h | 101 + lib/zstd/common/entropy_common.c | 355 ++ lib/zstd/common
Re: [PATCH] zstd: Fix decompression of large window archives on 32-bit platforms
On Sun, Sep 13, 2020 at 11:19 PM Petr Malat wrote: > > It seems some optimization has been removed from the code without removing > the if condition which should activate it only on 64-bit platforms and as > a result the code responsible for decompression with window larger than > 8MB was disabled on 32-bit platforms. > > Signed-off-by: Petr Malat Reviewed-by: Nick Terrell Thanks for the fix! I looked upstream and this fix corresponds to this upstream commit: https://github.com/facebook/zstd/commit/8a5c0c98ae5a7884694589d7a69bc99011add94d Thanks, Nick Terrell > --- > lib/zstd/decompress.c | 8 ++-- > 1 file changed, 2 insertions(+), 6 deletions(-) > > diff --git a/lib/zstd/decompress.c b/lib/zstd/decompress.c > index db6761ea4deb..509a3b8d51b9 100644 > --- a/lib/zstd/decompress.c > +++ b/lib/zstd/decompress.c > @@ -1457,12 +1457,8 @@ static size_t ZSTD_decompressBlock_internal(ZSTD_DCtx > *dctx, void *dst, size_t d > ip += litCSize; > srcSize -= litCSize; > } > - if (sizeof(size_t) > 4) /* do not enable prefetching on 32-bits x86, > as it's performance detrimental */ > - /* likely because of register pressure */ > - /* if that's the correct cause, then 32-bits > ARM should be affected differently */ > - /* it would be good to test this on ARM real > hardware, to see if prefetch version improves speed */ > - if (dctx->fParams.windowSize > (1 << 23)) > - return ZSTD_decompressSequencesLong(dctx, dst, > dstCapacity, ip, srcSize); > + if (dctx->fParams.windowSize > (1 << 23)) > + return ZSTD_decompressSequencesLong(dctx, dst, dstCapacity, > ip, srcSize); > return ZSTD_decompressSequences(dctx, dst, dstCapacity, ip, srcSize); > } > > -- > 2.20.1 >
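The effect of the patch above can be sketched as two predicates: the old one also required a 64-bit size_t, while the fixed one looks only at the window size. This is an illustration under stated assumptions, not the kernel code itself. The demo_ names are hypothetical, and sizeof(size_t) is passed in as a parameter so both platform widths can be tried on one machine.

```c
#include <stdbool.h>
#include <stddef.h>

/* 8 MB threshold taken from the patch: 1 << 23 bytes. */
#define DEMO_LONG_WINDOW_THRESHOLD ((size_t)1 << 23)

/*
 * Old condition: the long-window decoder was only reachable when size_t is
 * wider than 4 bytes, so 32-bit platforms never selected it.
 */
static bool demo_uses_long_decoder_old(size_t window_size, size_t sizeof_size_t)
{
	return sizeof_size_t > 4 && window_size > DEMO_LONG_WINDOW_THRESHOLD;
}

/* Fixed condition: only the frame's window size decides. */
static bool demo_uses_long_decoder_fixed(size_t window_size)
{
	return window_size > DEMO_LONG_WINDOW_THRESHOLD;
}
```

For a 16 MB window, the old predicate returns false when given a 4-byte size_t, which matches the behaviour the commit message describes for 32-bit platforms.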
Re: [PATCH v3 1/2] lib: decompress_unzstd: Limit output size
> On Sep 1, 2020, at 7:26 AM, Paul Cercueil wrote: > > The zstd decompression code, as it is right now, will most likely fail > on 32-bit systems, as the default output buffer size causes the buffer's > end address to overflow. > > Address this issue by setting a sane default to the default output size, > with a value that won't overflow the buffer's end address. > > Signed-off-by: Paul Cercueil > --- > > Notes: >v2: Change limit to 1 GiB > >v3: Compute size limit instead of using hardcoded value > > lib/decompress_unzstd.c | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c > index 0ad2c15479ed..790abc472f5b 100644 > --- a/lib/decompress_unzstd.c > +++ b/lib/decompress_unzstd.c > @@ -178,8 +178,13 @@ static int INIT __unzstd(unsigned char *in_buf, long > in_len, > int err; > size_t ret; > > + /* > + * ZSTD decompression code won't be happy if the buffer size is so big > + * that its end address overflows. When the size is not provided, make > + * it as big as possible without having the end address overflow. > + */ > if (out_len == 0) > - out_len = LONG_MAX; /* no limit */ > + out_len = UINTPTR_MAX - (uintptr_t)out_buf; Great, that works for me. Thanks for fixing this! Reviewed-by: Nick Terrell > if (fill == NULL && flush == NULL) > /* > -- > 2.28.0 >
Re: [PATCH v2 1/2] lib: decompress_unzstd: Limit output size
> On Aug 25, 2020, at 2:01 PM, Paul Cercueil wrote: > > The zstd decompression code, as it is right now, will have internal > values overflow on 32-bit systems when the output size is bigger than > 1 GiB. > > Until someone smarter than me can figure out how to fix the zstd code > properly, limit the destination buffer size to 1 GiB, which should be > enough for everybody, in order to make it usable on 32-bit systems. I was talking with Yann Collet, and we believe that it isn’t the long that is overflowing, but the pointers. Zstd expects to be given a valid output size. It generally uses a begin/end pointer with its output buffer. So when it is given a very large output size in 32-bit mode the end pointer will overflow the pointer either causing UB, or end pointer < begin pointer, which breaks zstd. Zstd will probably never be able to work properly in this way. A better solution might be to pass MAX_ADDRESS_PTR - OUTPUT_PTR as the size to the __decompress() call. Or some other size that won’t overflow the pointer. Best, Nick > Signed-off-by: Paul Cercueil > Reviewed-by: Nick Terrell > --- > > Notes: >v2: Change limit to 1 GiB > > lib/decompress_unzstd.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c > index 0ad2c15479ed..414517baedb0 100644 > --- a/lib/decompress_unzstd.c > +++ b/lib/decompress_unzstd.c > @@ -77,6 +77,7 @@ > > #include > #include > +#include > #include > > /* 128MB is the maximum window size supported by zstd. */ > @@ -179,7 +180,7 @@ static int INIT __unzstd(unsigned char *in_buf, long > in_len, > size_t ret; > > if (out_len == 0) > - out_len = LONG_MAX; /* no limit */ > + out_len = SZ_1G; /* should be big enough, right? */ > > if (fill == NULL && flush == NULL) > /* > -- > 2.28.0 >
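The begin/end pointer problem described in the reply above can be sketched with plain integers. The sketch models a 32-bit address space with uint32_t so that the wraparound is well-defined unsigned arithmetic instead of undefined pointer overflow. The demo_ names and the example address 0x80100000 are hypothetical and chosen only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* End address of an output buffer in a simulated 32-bit address space. */
static uint32_t demo_end_address(uint32_t out_buf, uint32_t out_len)
{
	return (uint32_t)(out_buf + out_len); /* wraps modulo 2^32 */
}

/* True when the end address wrapped and now lies below the start. */
static bool demo_end_wraps(uint32_t out_buf, uint32_t out_len)
{
	return demo_end_address(out_buf, out_len) < out_buf;
}

/*
 * Largest length that keeps the end address from wrapping: the
 * "MAX_ADDRESS_PTR - OUTPUT_PTR" idea from the reply.
 */
static uint32_t demo_max_out_len(uint32_t out_buf)
{
	return UINT32_MAX - out_buf;
}
```

With a buffer at 0x80100000, a length of 0x7fffffff (LONG_MAX on a 32-bit system) puts the end address at 0x000fffff, which is below the start of the buffer. The length returned by demo_max_out_len() puts it at exactly 0xffffffff.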
Re: [PATCH 1/2] lib: decompress_unzstd: Limit output size
> On Aug 24, 2020, at 2:05 PM, Paul Cercueil wrote: > > > > Le lun. 24 août 2020 à 20:11, Nick Terrell a écrit : >>> On Aug 21, 2020, at 9:29 AM, Paul Cercueil wrote: >>> The zstd decompression code, as it is right now, will have internal >>> values overflow on 32-bit systems when the output size is LONG_MAX. >>> Until someone smarter than me can figure out how to fix the zstd code >>> properly, limit the destination buffer size to 512 MiB, which should be >>> enough for everybody, in order to make it usable on 32-bit systems. >> Can you bump the size up to 2GB? I suspect the problem inside of zstd >> is an off-by-one error or something similar, so getting closer to the limit >> shouldn't be a problem. I’d feel more comfortable with 2GB, since >> kernels can get pretty large. > > SZ_1G is the biggest I can go to get the kernel to boot. With SZ_2G it won't > boot. Strange… I don’t quite know what is going on then. Thanks for the fix! You can add: Reviewed-By: Nick Terrell Best, Nick >> Hmm, zstd shouldn’t be overflowing that value. I’m currently preparing >> a patch to updating the version of zstd in the kernel, and using upstream >> directly. I will add a test upstream in 32-bit mode to ensure that we don’t >> overflow a 32-bit size_t, so this will be fixed after the update. > > Great, thanks. > > Cheers, > -Paul > >> -Nick >>> Signed-off-by: Paul Cercueil >>> --- >>> lib/decompress_unzstd.c | 3 ++- >>> 1 file changed, 2 insertions(+), 1 deletion(-) >>> diff --git a/lib/decompress_unzstd.c b/lib/decompress_unzstd.c >>> index 0ad2c15479ed..e1c03b1eaa6e 100644 >>> --- a/lib/decompress_unzstd.c >>> +++ b/lib/decompress_unzstd.c >>> @@ -77,6 +77,7 @@ >>> #include >>> #include >>> +#include >>> #include >>> /* 128MB is the maximum window size supported by zstd. 
*/ >>> @@ -179,7 +180,7 @@ static int INIT __unzstd(unsigned char *in_buf, long >>> in_len, >>> size_t ret; >>> if (out_len == 0) >>> - out_len = LONG_MAX; /* no limit */ >>> + out_len = SZ_512M; /* should be big enough, right? */ >>> if (fill == NULL && flush == NULL) >>> /* >>> -- >>> 2.28.0 > >
Re: [PATCH 2/2] MIPS: Add support for ZSTD-compressed kernels
> On Aug 24, 2020, at 2:02 PM, Paul Cercueil wrote: > > Hi Nick, > > Le lun. 24 août 2020 à 19:51, Nick Terrell a écrit : >>> On Aug 21, 2020, at 9:29 AM, Paul Cercueil wrote: >>> Add support for self-extracting kernels with a ZSTD compression. >>> Tested on a kernel for the GCW-Zero, it allows to reduce the size of the >>> kernel file from 4.1 MiB with gzip to 3.5 MiB with ZSTD, and boots just >>> as fast. >>> Signed-off-by: Paul Cercueil >>> --- >>> arch/mips/Kconfig | 1 + >>> arch/mips/boot/compressed/Makefile | 1 + >>> arch/mips/boot/compressed/decompress.c | 4 >>> arch/mips/boot/compressed/string.c | 16 >>> 4 files changed, 22 insertions(+) >>> diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig >>> index c95fa3a2484c..b9d7c4249dc9 100644 >>> --- a/arch/mips/Kconfig >>> +++ b/arch/mips/Kconfig >>> @@ -1890,6 +1890,7 @@ config SYS_SUPPORTS_ZBOOT >>> select HAVE_KERNEL_LZMA >>> select HAVE_KERNEL_LZO >>> select HAVE_KERNEL_XZ >>> + select HAVE_KERNEL_ZSTD >>> config SYS_SUPPORTS_ZBOOT_UART16550 >>> bool >>> diff --git a/arch/mips/boot/compressed/Makefile >>> b/arch/mips/boot/compressed/Makefile >>> index 6e56caef69f0..86ddc6fc16f4 100644 >>> --- a/arch/mips/boot/compressed/Makefile >>> +++ b/arch/mips/boot/compressed/Makefile >>> @@ -70,6 +70,7 @@ tool_$(CONFIG_KERNEL_LZ4) = lz4 >>> tool_$(CONFIG_KERNEL_LZMA)= lzma >>> tool_$(CONFIG_KERNEL_LZO) = lzo >>> tool_$(CONFIG_KERNEL_XZ) = xzkern >>> +tool_$(CONFIG_KERNEL_ZSTD)= zstd >> You can use zstd22 here. It will give you slightly better compression >> without any additional memory usage. Also, you should add >> -D__DISABLE_EXPORTS to the KBUILD_CFLAGS like x86 does [1]. > > Indeed, it's 0.01% smaller :) > > What is __DISABLE_EXPORTS for? It disables the EXPORT_SYMBOL() macros inside of lib/zstd/decompress.c. On x86 the kernel won’t boot with these defined. Other decompressors hide them if the STATIC macro is defined, but zstd uses this method, which was added somewhat recently. 
-Nick > -Paul > >> [1] >> https://github.com/torvalds/linux/blob/master/arch/x86/boot/compressed/Makefile >> -Nick >>> targets += vmlinux.bin.z >>> $(obj)/vmlinux.bin.z: $(obj)/vmlinux.bin FORCE >>> diff --git a/arch/mips/boot/compressed/decompress.c >>> b/arch/mips/boot/compressed/decompress.c >>> index 88f5d637b1c4..c61c641674e6 100644 >>> --- a/arch/mips/boot/compressed/decompress.c >>> +++ b/arch/mips/boot/compressed/decompress.c >>> @@ -72,6 +72,10 @@ void error(char *x) >>> #include "../../../../lib/decompress_unxz.c" >>> #endif >>> +#ifdef CONFIG_KERNEL_ZSTD >>> +#include "../../../../lib/decompress_unzstd.c" >>> +#endif >>> + >>> const unsigned long __stack_chk_guard = 0x000a0dff; >>> void __stack_chk_fail(void) >>> diff --git a/arch/mips/boot/compressed/string.c >>> b/arch/mips/boot/compressed/string.c >>> index 43beecc3587c..ab95722ec0c9 100644 >>> --- a/arch/mips/boot/compressed/string.c >>> +++ b/arch/mips/boot/compressed/string.c >>> @@ -27,3 +27,19 @@ void *memset(void *s, int c, size_t n) >>> ss[i] = c; >>> return s; >>> } >>> + >>> +void *memmove(void *dest, const void *src, size_t n) >>> +{ >>> + unsigned int i; >>> + const char *s = src; >>> + char *d = dest; >>> + >>> + if ((uintptr_t)dest < (uintptr_t)src) { >>> + for (i = 0; i < n; i++) >>> + d[i] = s[i]; >>> + } else { >>> + for (i = n; i > 0; i--) >>> + d[i - 1] = s[i - 1]; >>> + } >>> + return dest; >>> +} >>> -- >>> 2.28.0