Re: [PATCH v6 1/3] lib: zstd: Add kernel-specific API

2020-12-03 Thread Nick Terrell


> On Dec 2, 2020, at 9:03 PM, Michał Mirosław  wrote:
> 
> On Thu, Dec 03, 2020 at 03:59:21AM +, Nick Terrell wrote:
>> On Dec 2, 2020, at 7:14 PM, Michał Mirosław  wrote:
>>> On Thu, Dec 03, 2020 at 01:42:03AM +, Nick Terrell wrote:
 On Dec 2, 2020, at 5:16 PM, Michał Mirosław  
 wrote:
> On Wed, Dec 02, 2020 at 12:32:40PM -0800, Nick Terrell wrote:
>> From: Nick Terrell 
>> 
>> This patch:
>> - Moves `include/linux/zstd.h` -> `lib/zstd/zstd.h`
>> - Adds a new API in `include/linux/zstd.h` that is functionally
>> equivalent to the in-use subset of the current API. Functions are
>> renamed to avoid symbol collisions with zstd, to make it clear it is
>> not the upstream zstd API, and to follow the kernel style guide.
>> - Updates all callers to use the new API.
>> 
>> There are no functional changes in this patch. Since there are no
>> functional change, I felt it was okay to update all the callers in a
>> single patch, since once the API is approved, the callers are
>> mechanically changed.
> [...]
>> --- a/lib/decompress_unzstd.c
>> +++ b/lib/decompress_unzstd.c
> [...]
>> static int INIT handle_zstd_error(size_t ret, void (*error)(char *x))
>> {
>> -const int err = ZSTD_getErrorCode(ret);
>> -
>> -if (!ZSTD_isError(ret))
>> +if (!zstd_is_error(ret))
>>  return 0;
>> 
>> -switch (err) {
>> -case ZSTD_error_memory_allocation:
>> -error("ZSTD decompressor ran out of memory");
>> -break;
>> -case ZSTD_error_prefix_unknown:
>> -error("Input is not in the ZSTD format (wrong magic 
>> bytes)");
>> -break;
>> -case ZSTD_error_dstSize_tooSmall:
>> -case ZSTD_error_corruption_detected:
>> -case ZSTD_error_checksum_wrong:
>> -error("ZSTD-compressed data is corrupt");
>> -break;
>> -default:
>> -error("ZSTD-compressed data is probably corrupt");
>> -break;
>> -}
>> +error("ZSTD decompression failed");
>>  return -1;
>> }
> 
> This looses diagnostics specificity - is this intended? At least the
> out-of-memory condition seems useful to distinguish.
 
 Good point. The zstd API no longer exposes the error code enum,
 but it does expose zstd_get_error_name() which can be used here.
 I was thinking that the string needed to be static for some reason, but
 that is not the case. I will make that change.
 
>> +size_t zstd_compress_stream(zstd_cstream *cstream,
>> +struct zstd_out_buffer *output, struct zstd_in_buffer *input)
>> +{
>> +ZSTD_outBuffer o;
>> +ZSTD_inBuffer i;
>> +size_t ret;
>> +
>> +memcpy(, output, sizeof(o));
>> +memcpy(, input, sizeof(i));
>> +ret = ZSTD_compressStream(cstream, , );
>> +memcpy(output, , sizeof(o));
>> +memcpy(input, , sizeof(i));
>> +return ret;
>> +}
> 
> Is all this copying necessary? How is it different from type-punning by
> direct pointer cast?
 
 If breaking strict aliasing and type-punning by pointer casing is okay, 
 then
 we can do that here. These memcpys will be negligible for performance, but
 type-punning would be more succinct if allowed.
>>> 
>>> Ah, this might break LTO builds due to strict aliasing violation.
>>> So I would suggest to just #define the ZSTD names to kernel ones
>>> for the library code.  Unless there is a cleaner solution...
>> 
>> I don’t want to do that because I want in the 3rd series of the patchset I 
>> update
>> to zstd-1.4.6. And I’m using zstd-1.4.6 as-is in upstream. This allows us to 
>> keep
>> the kernel version up to date, since the patch to update to a new version 
>> can be
>> generated automatically (and manually tested), so it doesn’t fall years 
>> behind
>> upstream again.
>> 
>> The alternative would be to make upstream zstd’s header public and
>> #define zstd_in_buffer ZSTD_inBuffer. But that would make zstd’s header
>> public, which would somewhat defeat the purpose of having a kernel wrapper.
> 
> I thought the problem was API style spill-over from the library to other parts
> of the kernel.  A header-only wrapper can stop this.  I'm not sure symbol
> visibility (namespace pollution) was a concern.

Thanks for the review Michał! I have just submitted a new version of the patches
with the suggested changes!

Best,
Nick Terrell

> Best Regards
> Michał Mirosław



Re: [PATCH v6 1/3] lib: zstd: Add kernel-specific API

2020-12-02 Thread Nick Terrell


> On Dec 2, 2020, at 9:03 PM, Michał Mirosław  wrote:
> 
> On Thu, Dec 03, 2020 at 03:59:21AM +, Nick Terrell wrote:
>> On Dec 2, 2020, at 7:14 PM, Michał Mirosław  wrote:
>>> On Thu, Dec 03, 2020 at 01:42:03AM +, Nick Terrell wrote:
 On Dec 2, 2020, at 5:16 PM, Michał Mirosław  
 wrote:
> On Wed, Dec 02, 2020 at 12:32:40PM -0800, Nick Terrell wrote:
>> From: Nick Terrell 
>> 
>> This patch:
>> - Moves `include/linux/zstd.h` -> `lib/zstd/zstd.h`
>> - Adds a new API in `include/linux/zstd.h` that is functionally
>> equivalent to the in-use subset of the current API. Functions are
>> renamed to avoid symbol collisions with zstd, to make it clear it is
>> not the upstream zstd API, and to follow the kernel style guide.
>> - Updates all callers to use the new API.
>> 
>> There are no functional changes in this patch. Since there are no
>> functional change, I felt it was okay to update all the callers in a
>> single patch, since once the API is approved, the callers are
>> mechanically changed.
> [...]
>> --- a/lib/decompress_unzstd.c
>> +++ b/lib/decompress_unzstd.c
> [...]
>> static int INIT handle_zstd_error(size_t ret, void (*error)(char *x))
>> {
>> -const int err = ZSTD_getErrorCode(ret);
>> -
>> -if (!ZSTD_isError(ret))
>> +if (!zstd_is_error(ret))
>>  return 0;
>> 
>> -switch (err) {
>> -case ZSTD_error_memory_allocation:
>> -error("ZSTD decompressor ran out of memory");
>> -break;
>> -case ZSTD_error_prefix_unknown:
>> -error("Input is not in the ZSTD format (wrong magic 
>> bytes)");
>> -break;
>> -case ZSTD_error_dstSize_tooSmall:
>> -case ZSTD_error_corruption_detected:
>> -case ZSTD_error_checksum_wrong:
>> -error("ZSTD-compressed data is corrupt");
>> -break;
>> -default:
>> -error("ZSTD-compressed data is probably corrupt");
>> -break;
>> -}
>> +error("ZSTD decompression failed");
>>  return -1;
>> }
> 
> This looses diagnostics specificity - is this intended? At least the
> out-of-memory condition seems useful to distinguish.
 
 Good point. The zstd API no longer exposes the error code enum,
 but it does expose zstd_get_error_name() which can be used here.
 I was thinking that the string needed to be static for some reason, but
 that is not the case. I will make that change.
 
>> +size_t zstd_compress_stream(zstd_cstream *cstream,
>> +struct zstd_out_buffer *output, struct zstd_in_buffer *input)
>> +{
>> +ZSTD_outBuffer o;
>> +ZSTD_inBuffer i;
>> +size_t ret;
>> +
>> +memcpy(, output, sizeof(o));
>> +memcpy(, input, sizeof(i));
>> +ret = ZSTD_compressStream(cstream, , );
>> +memcpy(output, , sizeof(o));
>> +memcpy(input, , sizeof(i));
>> +return ret;
>> +}
> 
> Is all this copying necessary? How is it different from type-punning by
> direct pointer cast?
 
 If breaking strict aliasing and type-punning by pointer casing is okay, 
 then
 we can do that here. These memcpys will be negligible for performance, but
 type-punning would be more succinct if allowed.
>>> 
>>> Ah, this might break LTO builds due to strict aliasing violation.
>>> So I would suggest to just #define the ZSTD names to kernel ones
>>> for the library code.  Unless there is a cleaner solution...
>> 
>> I don’t want to do that because I want in the 3rd series of the patchset I 
>> update
>> to zstd-1.4.6. And I’m using zstd-1.4.6 as-is in upstream. This allows us to 
>> keep
>> the kernel version up to date, since the patch to update to a new version 
>> can be
>> generated automatically (and manually tested), so it doesn’t fall years 
>> behind
>> upstream again.
>> 
>> The alternative would be to make upstream zstd’s header public and
>> #define zstd_in_buffer ZSTD_inBuffer. But that would make zstd’s header
>> public, which would somewhat defeat the purpose of having a kernel wrapper.
> 
> I thought the problem was API style spill-over from the library to other parts
> of the kernel.  A header-only wrapper can stop this.  I'm not sure symbol
> visibility (namespace pollution) was a concern.

Thats true. It seems slightly unclean, but so Is duplicating these structs and 
memcpying
them. So I’ll go ahead and expose the upstream zstd’s header (“lib/zstd/zstd.h” 
here).
I’ll just need to pick a name for the upstream “zstd.h” header.

> Best Regards
> Michał Mirosław



Re: [PATCH v6 1/3] lib: zstd: Add kernel-specific API

2020-12-02 Thread Michał Mirosław
On Thu, Dec 03, 2020 at 03:59:21AM +, Nick Terrell wrote:
> On Dec 2, 2020, at 7:14 PM, Michał Mirosław  wrote:
> > On Thu, Dec 03, 2020 at 01:42:03AM +, Nick Terrell wrote:
> >> On Dec 2, 2020, at 5:16 PM, Michał Mirosław  
> >> wrote:
> >>> On Wed, Dec 02, 2020 at 12:32:40PM -0800, Nick Terrell wrote:
>  From: Nick Terrell 
>  
>  This patch:
>  - Moves `include/linux/zstd.h` -> `lib/zstd/zstd.h`
>  - Adds a new API in `include/linux/zstd.h` that is functionally
>  equivalent to the in-use subset of the current API. Functions are
>  renamed to avoid symbol collisions with zstd, to make it clear it is
>  not the upstream zstd API, and to follow the kernel style guide.
>  - Updates all callers to use the new API.
>  
>  There are no functional changes in this patch. Since there are no
>  functional change, I felt it was okay to update all the callers in a
>  single patch, since once the API is approved, the callers are
>  mechanically changed.
> >>> [...]
>  --- a/lib/decompress_unzstd.c
>  +++ b/lib/decompress_unzstd.c
> >>> [...]
>  static int INIT handle_zstd_error(size_t ret, void (*error)(char *x))
>  {
>  -const int err = ZSTD_getErrorCode(ret);
>  -
>  -if (!ZSTD_isError(ret))
>  +if (!zstd_is_error(ret))
>   return 0;
>  
>  -switch (err) {
>  -case ZSTD_error_memory_allocation:
>  -error("ZSTD decompressor ran out of memory");
>  -break;
>  -case ZSTD_error_prefix_unknown:
>  -error("Input is not in the ZSTD format (wrong magic 
>  bytes)");
>  -break;
>  -case ZSTD_error_dstSize_tooSmall:
>  -case ZSTD_error_corruption_detected:
>  -case ZSTD_error_checksum_wrong:
>  -error("ZSTD-compressed data is corrupt");
>  -break;
>  -default:
>  -error("ZSTD-compressed data is probably corrupt");
>  -break;
>  -}
>  +error("ZSTD decompression failed");
>   return -1;
>  }
> >>> 
> >>> This looses diagnostics specificity - is this intended? At least the
> >>> out-of-memory condition seems useful to distinguish.
> >> 
> >> Good point. The zstd API no longer exposes the error code enum,
> >> but it does expose zstd_get_error_name() which can be used here.
> >> I was thinking that the string needed to be static for some reason, but
> >> that is not the case. I will make that change.
> >> 
>  +size_t zstd_compress_stream(zstd_cstream *cstream,
>  +struct zstd_out_buffer *output, struct zstd_in_buffer *input)
>  +{
>  +ZSTD_outBuffer o;
>  +ZSTD_inBuffer i;
>  +size_t ret;
>  +
>  +memcpy(, output, sizeof(o));
>  +memcpy(, input, sizeof(i));
>  +ret = ZSTD_compressStream(cstream, , );
>  +memcpy(output, , sizeof(o));
>  +memcpy(input, , sizeof(i));
>  +return ret;
>  +}
> >>> 
> >>> Is all this copying necessary? How is it different from type-punning by
> >>> direct pointer cast?
> >> 
> >> If breaking strict aliasing and type-punning by pointer casing is okay, 
> >> then
> >> we can do that here. These memcpys will be negligible for performance, but
> >> type-punning would be more succinct if allowed.
> > 
> > Ah, this might break LTO builds due to strict aliasing violation.
> > So I would suggest to just #define the ZSTD names to kernel ones
> > for the library code.  Unless there is a cleaner solution...
> 
> I don’t want to do that because I want in the 3rd series of the patchset I 
> update
> to zstd-1.4.6. And I’m using zstd-1.4.6 as-is in upstream. This allows us to 
> keep
> the kernel version up to date, since the patch to update to a new version can 
> be
> generated automatically (and manually tested), so it doesn’t fall years behind
> upstream again.
> 
> The alternative would be to make upstream zstd’s header public and
> #define zstd_in_buffer ZSTD_inBuffer. But that would make zstd’s header
> public, which would somewhat defeat the purpose of having a kernel wrapper.

I thought the problem was API style spill-over from the library to other parts
of the kernel.  A header-only wrapper can stop this.  I'm not sure symbol
visibility (namespace pollution) was a concern.

Best Regards
Michał Mirosław


Re: [PATCH v6 1/3] lib: zstd: Add kernel-specific API

2020-12-02 Thread Nick Terrell


> On Dec 2, 2020, at 7:14 PM, Michał Mirosław  wrote:
> 
> On Thu, Dec 03, 2020 at 01:42:03AM +, Nick Terrell wrote:
>> 
>> 
>>> On Dec 2, 2020, at 5:16 PM, Michał Mirosław  wrote:
>>> 
>>> On Wed, Dec 02, 2020 at 12:32:40PM -0800, Nick Terrell wrote:
 From: Nick Terrell 
 
 This patch:
 - Moves `include/linux/zstd.h` -> `lib/zstd/zstd.h`
 - Adds a new API in `include/linux/zstd.h` that is functionally
 equivalent to the in-use subset of the current API. Functions are
 renamed to avoid symbol collisions with zstd, to make it clear it is
 not the upstream zstd API, and to follow the kernel style guide.
 - Updates all callers to use the new API.
 
 There are no functional changes in this patch. Since there are no
 functional change, I felt it was okay to update all the callers in a
 single patch, since once the API is approved, the callers are
 mechanically changed.
>>> [...]
 --- a/lib/decompress_unzstd.c
 +++ b/lib/decompress_unzstd.c
>>> [...]
 static int INIT handle_zstd_error(size_t ret, void (*error)(char *x))
 {
 -  const int err = ZSTD_getErrorCode(ret);
 -
 -  if (!ZSTD_isError(ret))
 +  if (!zstd_is_error(ret))
return 0;
 
 -  switch (err) {
 -  case ZSTD_error_memory_allocation:
 -  error("ZSTD decompressor ran out of memory");
 -  break;
 -  case ZSTD_error_prefix_unknown:
 -  error("Input is not in the ZSTD format (wrong magic bytes)");
 -  break;
 -  case ZSTD_error_dstSize_tooSmall:
 -  case ZSTD_error_corruption_detected:
 -  case ZSTD_error_checksum_wrong:
 -  error("ZSTD-compressed data is corrupt");
 -  break;
 -  default:
 -  error("ZSTD-compressed data is probably corrupt");
 -  break;
 -  }
 +  error("ZSTD decompression failed");
return -1;
 }
>>> 
>>> This looses diagnostics specificity - is this intended? At least the
>>> out-of-memory condition seems useful to distinguish.
>> 
>> Good point. The zstd API no longer exposes the error code enum,
>> but it does expose zstd_get_error_name() which can be used here.
>> I was thinking that the string needed to be static for some reason, but
>> that is not the case. I will make that change.
>> 
 +size_t zstd_compress_stream(zstd_cstream *cstream,
 +  struct zstd_out_buffer *output, struct zstd_in_buffer *input)
 +{
 +  ZSTD_outBuffer o;
 +  ZSTD_inBuffer i;
 +  size_t ret;
 +
 +  memcpy(, output, sizeof(o));
 +  memcpy(, input, sizeof(i));
 +  ret = ZSTD_compressStream(cstream, , );
 +  memcpy(output, , sizeof(o));
 +  memcpy(input, , sizeof(i));
 +  return ret;
 +}
>>> 
>>> Is all this copying necessary? How is it different from type-punning by
>>> direct pointer cast?
>> 
>> If breaking strict aliasing and type-punning by pointer casing is okay, then
>> we can do that here. These memcpys will be negligible for performance, but
>> type-punning would be more succinct if allowed.
> 
> Ah, this might break LTO builds due to strict aliasing violation.
> So I would suggest to just #define the ZSTD names to kernel ones
> for the library code.  Unless there is a cleaner solution...

I don’t want to do that because I want in the 3rd series of the patchset I 
update
to zstd-1.4.6. And I’m using zstd-1.4.6 as-is in upstream. This allows us to 
keep
the kernel version up to date, since the patch to update to a new version can be
generated automatically (and manually tested), so it doesn’t fall years behind
upstream again.

The alternative would be to make upstream zstd’s header public and
#define zstd_in_buffer ZSTD_inBuffer. But that would make zstd’s header
public, which would somewhat defeat the purpose of having a kernel wrapper.

These memcpy’s won’t hurt performance, since this function is called at most
every 4KB of input data in all the callers, though they are clunky.



Re: [PATCH v6 1/3] lib: zstd: Add kernel-specific API

2020-12-02 Thread Michał Mirosław
On Thu, Dec 03, 2020 at 01:42:03AM +, Nick Terrell wrote:
> 
> 
> > On Dec 2, 2020, at 5:16 PM, Michał Mirosław  wrote:
> > 
> > On Wed, Dec 02, 2020 at 12:32:40PM -0800, Nick Terrell wrote:
> >> From: Nick Terrell 
> >> 
> >> This patch:
> >> - Moves `include/linux/zstd.h` -> `lib/zstd/zstd.h`
> >> - Adds a new API in `include/linux/zstd.h` that is functionally
> >>  equivalent to the in-use subset of the current API. Functions are
> >>  renamed to avoid symbol collisions with zstd, to make it clear it is
> >>  not the upstream zstd API, and to follow the kernel style guide.
> >> - Updates all callers to use the new API.
> >> 
> >> There are no functional changes in this patch. Since there are no
> >> functional change, I felt it was okay to update all the callers in a
> >> single patch, since once the API is approved, the callers are
> >> mechanically changed.
> > [...]
> >> --- a/lib/decompress_unzstd.c
> >> +++ b/lib/decompress_unzstd.c
> > [...]
> >> static int INIT handle_zstd_error(size_t ret, void (*error)(char *x))
> >> {
> >> -  const int err = ZSTD_getErrorCode(ret);
> >> -
> >> -  if (!ZSTD_isError(ret))
> >> +  if (!zstd_is_error(ret))
> >>return 0;
> >> 
> >> -  switch (err) {
> >> -  case ZSTD_error_memory_allocation:
> >> -  error("ZSTD decompressor ran out of memory");
> >> -  break;
> >> -  case ZSTD_error_prefix_unknown:
> >> -  error("Input is not in the ZSTD format (wrong magic bytes)");
> >> -  break;
> >> -  case ZSTD_error_dstSize_tooSmall:
> >> -  case ZSTD_error_corruption_detected:
> >> -  case ZSTD_error_checksum_wrong:
> >> -  error("ZSTD-compressed data is corrupt");
> >> -  break;
> >> -  default:
> >> -  error("ZSTD-compressed data is probably corrupt");
> >> -  break;
> >> -  }
> >> +  error("ZSTD decompression failed");
> >>return -1;
> >> }
> > 
> > This looses diagnostics specificity - is this intended? At least the
> > out-of-memory condition seems useful to distinguish.
> 
> Good point. The zstd API no longer exposes the error code enum,
> but it does expose zstd_get_error_name() which can be used here.
> I was thinking that the string needed to be static for some reason, but
> that is not the case. I will make that change.
> 
> >> +size_t zstd_compress_stream(zstd_cstream *cstream,
> >> +  struct zstd_out_buffer *output, struct zstd_in_buffer *input)
> >> +{
> >> +  ZSTD_outBuffer o;
> >> +  ZSTD_inBuffer i;
> >> +  size_t ret;
> >> +
> >> +  memcpy(, output, sizeof(o));
> >> +  memcpy(, input, sizeof(i));
> >> +  ret = ZSTD_compressStream(cstream, , );
> >> +  memcpy(output, , sizeof(o));
> >> +  memcpy(input, , sizeof(i));
> >> +  return ret;
> >> +}
> > 
> > Is all this copying necessary? How is it different from type-punning by
> > direct pointer cast?
> 
> If breaking strict aliasing and type-punning by pointer casing is okay, then
> we can do that here. These memcpys will be negligible for performance, but
> type-punning would be more succinct if allowed.

Ah, this might break LTO builds due to strict aliasing violation.
So I would suggest to just #define the ZSTD names to kernel ones
for the library code.  Unless there is a cleaner solution...

Best Regards
Michał Mirosław


Re: [PATCH v6 1/3] lib: zstd: Add kernel-specific API

2020-12-02 Thread Nick Terrell


> On Dec 2, 2020, at 5:16 PM, Michał Mirosław  wrote:
> 
> On Wed, Dec 02, 2020 at 12:32:40PM -0800, Nick Terrell wrote:
>> From: Nick Terrell 
>> 
>> This patch:
>> - Moves `include/linux/zstd.h` -> `lib/zstd/zstd.h`
>> - Adds a new API in `include/linux/zstd.h` that is functionally
>>  equivalent to the in-use subset of the current API. Functions are
>>  renamed to avoid symbol collisions with zstd, to make it clear it is
>>  not the upstream zstd API, and to follow the kernel style guide.
>> - Updates all callers to use the new API.
>> 
>> There are no functional changes in this patch. Since there are no
>> functional change, I felt it was okay to update all the callers in a
>> single patch, since once the API is approved, the callers are
>> mechanically changed.
> [...]
>> --- a/lib/decompress_unzstd.c
>> +++ b/lib/decompress_unzstd.c
> [...]
>> static int INIT handle_zstd_error(size_t ret, void (*error)(char *x))
>> {
>> -const int err = ZSTD_getErrorCode(ret);
>> -
>> -if (!ZSTD_isError(ret))
>> +if (!zstd_is_error(ret))
>>  return 0;
>> 
>> -switch (err) {
>> -case ZSTD_error_memory_allocation:
>> -error("ZSTD decompressor ran out of memory");
>> -break;
>> -case ZSTD_error_prefix_unknown:
>> -error("Input is not in the ZSTD format (wrong magic bytes)");
>> -break;
>> -case ZSTD_error_dstSize_tooSmall:
>> -case ZSTD_error_corruption_detected:
>> -case ZSTD_error_checksum_wrong:
>> -error("ZSTD-compressed data is corrupt");
>> -break;
>> -default:
>> -error("ZSTD-compressed data is probably corrupt");
>> -break;
>> -}
>> +error("ZSTD decompression failed");
>>  return -1;
>> }
> 
> This looses diagnostics specificity - is this intended? At least the
> out-of-memory condition seems useful to distinguish.

Good point. The zstd API no longer exposes the error code enum,
but it does expose zstd_get_error_name() which can be used here.
I was thinking that the string needed to be static for some reason, but
that is not the case. I will make that change.

>> +size_t zstd_compress_stream(zstd_cstream *cstream,
>> +struct zstd_out_buffer *output, struct zstd_in_buffer *input)
>> +{
>> +ZSTD_outBuffer o;
>> +ZSTD_inBuffer i;
>> +size_t ret;
>> +
>> +memcpy(, output, sizeof(o));
>> +memcpy(, input, sizeof(i));
>> +ret = ZSTD_compressStream(cstream, , );
>> +memcpy(output, , sizeof(o));
>> +memcpy(input, , sizeof(i));
>> +return ret;
>> +}
> 
> Is all this copying necessary? How is it different from type-punning by
> direct pointer cast?

If breaking strict aliasing and type-punning by pointer casing is okay, then
we can do that here. These memcpys will be negligible for performance, but
type-punning would be more succinct if allowed.

Best,
Nick



Re: [PATCH v6 1/3] lib: zstd: Add kernel-specific API

2020-12-02 Thread Michał Mirosław
On Wed, Dec 02, 2020 at 12:32:40PM -0800, Nick Terrell wrote:
> From: Nick Terrell 
> 
> This patch:
> - Moves `include/linux/zstd.h` -> `lib/zstd/zstd.h`
> - Adds a new API in `include/linux/zstd.h` that is functionally
>   equivalent to the in-use subset of the current API. Functions are
>   renamed to avoid symbol collisions with zstd, to make it clear it is
>   not the upstream zstd API, and to follow the kernel style guide.
> - Updates all callers to use the new API.
> 
> There are no functional changes in this patch. Since there are no
> functional change, I felt it was okay to update all the callers in a
> single patch, since once the API is approved, the callers are
> mechanically changed.
[...]
> --- a/lib/decompress_unzstd.c
> +++ b/lib/decompress_unzstd.c
[...]
>  static int INIT handle_zstd_error(size_t ret, void (*error)(char *x))
>  {
> - const int err = ZSTD_getErrorCode(ret);
> -
> - if (!ZSTD_isError(ret))
> + if (!zstd_is_error(ret))
>   return 0;
>  
> - switch (err) {
> - case ZSTD_error_memory_allocation:
> - error("ZSTD decompressor ran out of memory");
> - break;
> - case ZSTD_error_prefix_unknown:
> - error("Input is not in the ZSTD format (wrong magic bytes)");
> - break;
> - case ZSTD_error_dstSize_tooSmall:
> - case ZSTD_error_corruption_detected:
> - case ZSTD_error_checksum_wrong:
> - error("ZSTD-compressed data is corrupt");
> - break;
> - default:
> - error("ZSTD-compressed data is probably corrupt");
> - break;
> - }
> + error("ZSTD decompression failed");
>   return -1;
>  }

This looses diagnostics specificity - is this intended? At least the
out-of-memory condition seems useful to distinguish.

> +size_t zstd_compress_stream(zstd_cstream *cstream,
> + struct zstd_out_buffer *output, struct zstd_in_buffer *input)
> +{
> + ZSTD_outBuffer o;
> + ZSTD_inBuffer i;
> + size_t ret;
> +
> + memcpy(, output, sizeof(o));
> + memcpy(, input, sizeof(i));
> + ret = ZSTD_compressStream(cstream, , );
> + memcpy(output, , sizeof(o));
> + memcpy(input, , sizeof(i));
> + return ret;
> +}

Is all this copying necessary? How is it different from type-punning by
direct pointer cast?

Best Regards
Michał Mirosław