Re: [RESEND PATCH] crypto: Add zstd support

2018-03-21 Thread Sergey Senozhatsky
On (03/21/18 15:49), Nick Terrell wrote:
> depends on CONFIG_CRYPTO_ZSTD, which isn't defined until this patch is in

Yikes! How come I missed that... :)

> [0] 
> lkml.kernel.org/r/9c9416b2dff19f05fb4c35879aaa83d11ff72c92.1521626182.git.geliangt...@gmail.com

Signed-off-by: Nick Terrell  ?

-ss


Re: [PATCH 0/1] cover-letter/lz4: Implement lz4 with dynamic offset length.

2018-03-21 Thread Sergey Senozhatsky
On (03/21/18 19:56), Nick Terrell wrote:
[..]
> This seems like a reasonable extension to the algorithm, and it looks like
> LZ4_DYN is about a 5% improvement to compression ratio on your benchmark.
> The biggest question I have is if it is worthwhile to maintain a separate
> incompatible variant of LZ4 in the kernel without any upstream for a 5%
> gain? If we do want to go forward with this, we should perform more
> benchmarks.
> 
> I commented in the patch, but because the `dynOffset` variable isn't a
> compile time static in LZ4_decompress_generic(), I suspect that the patch
> causes a regression in decompression speed for both LZ4 and LZ4_DYN. You'll
> need to re-run the benchmarks to first show that LZ4 before the patch
> performs the same as LZ4 after the patch. Then re-run the LZ4 vs LZ4_DYN
> benchmarks.
> 
> I would also like to see a benchmark in user-space (with the code), so we
> can see the performance of LZ4 before and after the patch, as well as LZ4
> vs LZ4_DYN without anything else going on. I expect the extra branches in
> the decoding loop to have an impact on speed, and I would like to see how
> big the impact is without noise. 

Yes, I've been thinking about this. There are more branches now
("to dyn or not to dyn") in compression/decompression hot path but
we see less instructions and branches in perf output at the end.
And my guess is that we have a lot of noise from zram and zsmalloc.
The data is XXX bytes shorter with dyn enabled, so we use YYY less
moves and ZZZ less branches while we copy the data to zsmalloc and
from zsmalloc, and I may be that's the root cause of "performance
gain" that we see in zram-fio tests. So may be we need to run
benchmarks against lz4, not zram+lz4.

> CC-ing Yann Collet, the author of LZ4

Great, thanks.

-ss


Re: [PATCH v6 07/12] integrity: Select CONFIG_KEYS instead of depending on it

2018-03-21 Thread Mimi Zohar
On Fri, 2018-03-16 at 17:38 -0300, Thiago Jung Bauermann wrote:
> This avoids a dependency cycle in CONFIG_IMA_APPRAISE_MODSIG (introduced by
> a later patch in this series): it will select CONFIG_MODULE_SIG_FORMAT
> which in turn selects CONFIG_KEYS. Kconfig then complains that
> CONFIG_INTEGRITY_SIGNATURE depends on CONFIG_KEYS.
> 
> Signed-off-by: Thiago Jung Bauermann 

Signed-off-by: Mimi Zohar 

> ---
>  security/integrity/Kconfig | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/security/integrity/Kconfig b/security/integrity/Kconfig
> index da9565891738..0d642e0317c7 100644
> --- a/security/integrity/Kconfig
> +++ b/security/integrity/Kconfig
> @@ -17,8 +17,8 @@ if INTEGRITY
> 
>  config INTEGRITY_SIGNATURE
>   bool "Digital signature verification using multiple keyrings"
> - depends on KEYS
>   default n
> + select KEYS
>   select SIGNATURE
>   help
> This option enables digital signature verification support
> 



Re: [PATCH v6 06/12] integrity: Introduce asymmetric_sig_has_known_key()

2018-03-21 Thread Mimi Zohar
On Fri, 2018-03-16 at 17:38 -0300, Thiago Jung Bauermann wrote:
> IMA will only look for a modsig if the xattr sig references a key which is
> not in the expected kernel keyring. To that end, introduce
> asymmetric_sig_has_known_key().
> 
> The logic of extracting the key used in the xattr sig is factored out from
> asymmetric_verify() so that it can be used by the new function.
> 
> Signed-off-by: Thiago Jung Bauermann 

Signed-off-by: Mimi Zohar 

> ---
>  security/integrity/digsig_asymmetric.c | 44 
> +-
>  security/integrity/integrity.h |  8 +++
>  2 files changed, 41 insertions(+), 11 deletions(-)
> 
> diff --git a/security/integrity/digsig_asymmetric.c 
> b/security/integrity/digsig_asymmetric.c
> index ab6a029062a1..241647970c19 100644
> --- a/security/integrity/digsig_asymmetric.c
> +++ b/security/integrity/digsig_asymmetric.c
> @@ -79,26 +79,48 @@ static struct key *request_asymmetric_key(struct key 
> *keyring, uint32_t keyid)
>   return key;
>  }
> 
> -int asymmetric_verify(struct key *keyring, const char *sig,
> -   int siglen, const char *data, int datalen)
> +static struct key *asymmetric_key_from_sig(struct key *keyring, const char 
> *sig,
> +int siglen)
>  {
> - struct public_key_signature pks;
> - struct signature_v2_hdr *hdr = (struct signature_v2_hdr *)sig;
> - struct key *key;
> - int ret = -ENOMEM;
> + const struct signature_v2_hdr *hdr = (struct signature_v2_hdr *) sig;
> 
>   if (siglen <= sizeof(*hdr))
> - return -EBADMSG;
> + return ERR_PTR(-EBADMSG);
> 
>   siglen -= sizeof(*hdr);
> 
>   if (siglen != be16_to_cpu(hdr->sig_size))
> - return -EBADMSG;
> + return ERR_PTR(-EBADMSG);
> 
>   if (hdr->hash_algo >= HASH_ALGO__LAST)
> - return -ENOPKG;
> + return ERR_PTR(-ENOPKG);
> +
> + return request_asymmetric_key(keyring, be32_to_cpu(hdr->keyid));
> +}
> +
> +bool asymmetric_sig_has_known_key(struct key *keyring, const char *sig,
> +   int siglen)
> +{
> + struct key *key;
> +
> + key = asymmetric_key_from_sig(keyring, sig, siglen);
> + if (IS_ERR_OR_NULL(key))
> + return false;
> +
> + key_put(key);
> +
> + return true;
> +}
> +
> +int asymmetric_verify(struct key *keyring, const char *sig,
> +   int siglen, const char *data, int datalen)
> +{
> + struct public_key_signature pks;
> + struct signature_v2_hdr *hdr = (struct signature_v2_hdr *)sig;
> + struct key *key;
> + int ret = -ENOMEM;
> 
> - key = request_asymmetric_key(keyring, be32_to_cpu(hdr->keyid));
> + key = asymmetric_key_from_sig(keyring, sig, siglen);
>   if (IS_ERR(key))
>   return PTR_ERR(key);
> 
> @@ -109,7 +131,7 @@ int asymmetric_verify(struct key *keyring, const char 
> *sig,
>   pks.digest = (u8 *)data;
>   pks.digest_size = datalen;
>   pks.s = hdr->sig;
> - pks.s_size = siglen;
> + pks.s_size = siglen - sizeof(*hdr);
>   ret = verify_signature(key, );
>   key_put(key);
>   pr_debug("%s() = %d\n", __func__, ret);
> diff --git a/security/integrity/integrity.h b/security/integrity/integrity.h
> index 2d245f44ca26..4c381b992e11 100644
> --- a/security/integrity/integrity.h
> +++ b/security/integrity/integrity.h
> @@ -179,12 +179,20 @@ static inline int integrity_init_keyring(const unsigned 
> int id)
>  #ifdef CONFIG_INTEGRITY_ASYMMETRIC_KEYS
>  int asymmetric_verify(struct key *keyring, const char *sig,
> int siglen, const char *data, int datalen);
> +bool asymmetric_sig_has_known_key(struct key *keyring, const char *sig,
> +   int siglen);
>  #else
>  static inline int asymmetric_verify(struct key *keyring, const char *sig,
>   int siglen, const char *data, int datalen)
>  {
>   return -EOPNOTSUPP;
>  }
> +
> +static inline bool asymmetric_sig_has_known_key(struct key *keyring,
> + const char *sig, int siglen)
> +{
> + return false;
> +}
>  #endif
> 
>  #ifdef CONFIG_IMA_LOAD_X509
> 



[RESEND PATCH] crypto: Add zstd support

2018-03-21 Thread Nick Terrell
Adds zstd support to crypto and scompress. Only supports the default
level.

Previously we held off on this patch, since there weren't any users.
Now zram is ready for zstd support, but depends on CONFIG_CRYPTO_ZSTD,
which isn't defined until this patch is in. I also see a patch adding
zstd to pstore [0], which depends on crypto zstd.

[0] 
lkml.kernel.org/r/9c9416b2dff19f05fb4c35879aaa83d11ff72c92.1521626182.git.geliangt...@gmail.com
---
 crypto/Kconfig   |   9 ++
 crypto/Makefile  |   1 +
 crypto/testmgr.c |  10 +++
 crypto/testmgr.h |  71 +++
 crypto/zstd.c| 265 +++
 5 files changed, 356 insertions(+)
 create mode 100644 crypto/zstd.c

diff --git a/crypto/Kconfig b/crypto/Kconfig
index b75264b..500583c 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1677,6 +1677,15 @@ config CRYPTO_LZ4HC
help
  This is the LZ4 high compression mode algorithm.
 
+config CRYPTO_ZSTD
+   tristate "Zstd compression algorithm"
+   select CRYPTO_ALGAPI
+   select CRYPTO_ACOMP2
+   select ZSTD_COMPRESS
+   select ZSTD_DECOMPRESS
+   help
+ This is the zstd algorithm.
+
 comment "Random Number Generation"
 
 config CRYPTO_ANSI_CPRNG
diff --git a/crypto/Makefile b/crypto/Makefile
index cdbc03b..2900ab1 100644
--- a/crypto/Makefile
+++ b/crypto/Makefile
@@ -136,6 +136,7 @@ obj-$(CONFIG_CRYPTO_USER_API_HASH) += algif_hash.o
 obj-$(CONFIG_CRYPTO_USER_API_SKCIPHER) += algif_skcipher.o
 obj-$(CONFIG_CRYPTO_USER_API_RNG) += algif_rng.o
 obj-$(CONFIG_CRYPTO_USER_API_AEAD) += algif_aead.o
+obj-$(CONFIG_CRYPTO_ZSTD) += zstd.o
 
 ecdh_generic-y := ecc.o
 ecdh_generic-y += ecdh.o
diff --git a/crypto/testmgr.c b/crypto/testmgr.c
index d5e23a1..71b2798 100644
--- a/crypto/testmgr.c
+++ b/crypto/testmgr.c
@@ -3576,6 +3576,16 @@ static const struct alg_test_desc alg_test_descs[] = {
.decomp = 
__VECS(zlib_deflate_decomp_tv_template)
}
}
+   }, {
+   .alg = "zstd",
+   .test = alg_test_comp,
+   .fips_allowed = 1,
+   .suite = {
+   .comp = {
+   .comp = __VECS(zstd_comp_tv_template),
+   .decomp = __VECS(zstd_decomp_tv_template)
+   }
+   }
}
 };
 
diff --git a/crypto/testmgr.h b/crypto/testmgr.h
index 6044f69..9b5b930 100644
--- a/crypto/testmgr.h
+++ b/crypto/testmgr.h
@@ -35255,4 +35255,75 @@ static const struct comp_testvec 
lz4hc_decomp_tv_template[] = {
},
 };
 
+static const struct comp_testvec zstd_comp_tv_template[] = {
+   {
+   .inlen  = 68,
+   .outlen = 39,
+   .input  = "The algorithm is zstd. "
+ "The algorithm is zstd. "
+ "The algorithm is zstd.",
+   .output = "\x28\xb5\x2f\xfd\x00\x50\xf5\x00\x00\xb8\x54\x68\x65"
+ "\x20\x61\x6c\x67\x6f\x72\x69\x74\x68\x6d\x20\x69\x73"
+ "\x20\x7a\x73\x74\x64\x2e\x20\x01\x00\x55\x73\x36\x01"
+ ,
+   },
+   {
+   .inlen  = 244,
+   .outlen = 151,
+   .input  = "zstd, short for Zstandard, is a fast lossless "
+ "compression algorithm, targeting real-time "
+ "compression scenarios at zlib-level and better "
+ "compression ratios. The zstd compression library "
+ "provides in-memory compression and decompression "
+ "functions.",
+   .output = "\x28\xb5\x2f\xfd\x00\x50\x75\x04\x00\x42\x4b\x1e\x17"
+ "\x90\x81\x31\x00\xf2\x2f\xe4\x36\xc9\xef\x92\x88\x32"
+ "\xc9\xf2\x24\x94\xd8\x68\x9a\x0f\x00\x0c\xc4\x31\x6f"
+ "\x0d\x0c\x38\xac\x5c\x48\x03\xcd\x63\x67\xc0\xf3\xad"
+ "\x4e\x90\xaa\x78\xa0\xa4\xc5\x99\xda\x2f\xb6\x24\x60"
+ "\xe2\x79\x4b\xaa\xb6\x6b\x85\x0b\xc9\xc6\x04\x66\x86"
+ "\xe2\xcc\xe2\x25\x3f\x4f\x09\xcd\xb8\x9d\xdb\xc1\x90"
+ "\xa9\x11\xbc\x35\x44\x69\x2d\x9c\x64\x4f\x13\x31\x64"
+ "\xcc\xfb\x4d\x95\x93\x86\x7f\x33\x7f\x1a\xef\xe9\x30"
+ "\xf9\x67\xa1\x94\x0a\x69\x0f\x60\xcd\xc3\xab\x99\xdc"
+ "\x42\xed\x97\x05\x00\x33\xc3\x15\x95\x3a\x06\xa0\x0e"
+ "\x20\xa9\x0e\x82\xb9\x43\x45\x01",
+   },
+};
+
+static const struct comp_testvec zstd_decomp_tv_template[] = {
+   {
+   .inlen  = 43,
+   .outlen = 68,
+   .input  = "\x28\xb5\x2f\xfd\x04\x50\xf5\x00\x00\xb8\x54\x68\x65"
+ "\x20\x61\x6c\x67\x6f\x72\x69\x74\x68\x6d\x20\x69\x73"
+ 

Re: [PATCH v6 05/12] integrity: Introduce integrity_keyring_from_id()

2018-03-21 Thread Mimi Zohar
On Fri, 2018-03-16 at 17:38 -0300, Thiago Jung Bauermann wrote:
> IMA will need to obtain the keyring used to verify file signatures so that
> it can verify the module-style signature appended to files.
> 
> Signed-off-by: Thiago Jung Bauermann 

Signed-off-by: Mimi Zohar 

> ---
>  security/integrity/digsig.c| 28 +---
>  security/integrity/integrity.h |  6 ++
>  2 files changed, 27 insertions(+), 7 deletions(-)
> 
> diff --git a/security/integrity/digsig.c b/security/integrity/digsig.c
> index 6f9e4ce568cd..e641a67b9fc7 100644
> --- a/security/integrity/digsig.c
> +++ b/security/integrity/digsig.c
> @@ -48,11 +48,10 @@ static bool init_keyring __initdata;
>  #define restrict_link_to_ima restrict_link_by_builtin_trusted
>  #endif
> 
> -int integrity_digsig_verify(const unsigned int id, const char *sig, int 
> siglen,
> - const char *digest, int digestlen)
> +struct key *integrity_keyring_from_id(const unsigned int id)
>  {
> - if (id >= INTEGRITY_KEYRING_MAX || siglen < 2)
> - return -EINVAL;
> + if (id >= INTEGRITY_KEYRING_MAX)
> + return ERR_PTR(-EINVAL);
> 
>   if (!keyring[id]) {
>   keyring[id] =
> @@ -61,17 +60,32 @@ int integrity_digsig_verify(const unsigned int id, const 
> char *sig, int siglen,
>   int err = PTR_ERR(keyring[id]);
>   pr_err("no %s keyring: %d\n", keyring_name[id], err);
>   keyring[id] = NULL;
> - return err;
> + return ERR_PTR(err);
>   }
>   }
> 
> + return keyring[id];
> +}
> +
> +int integrity_digsig_verify(const unsigned int id, const char *sig, int 
> siglen,
> + const char *digest, int digestlen)
> +{
> + struct key *keyring;
> +
> + if (siglen < 2)
> + return -EINVAL;
> +
> + keyring = integrity_keyring_from_id(id);
> + if (IS_ERR(keyring))
> + return PTR_ERR(keyring);
> +
>   switch (sig[1]) {
>   case 1:
>   /* v1 API expect signature without xattr type */
> - return digsig_verify(keyring[id], sig + 1, siglen - 1,
> + return digsig_verify(keyring, sig + 1, siglen - 1,
>digest, digestlen);
>   case 2:
> - return asymmetric_verify(keyring[id], sig, siglen,
> + return asymmetric_verify(keyring, sig, siglen,
>digest, digestlen);
>   }
> 
> diff --git a/security/integrity/integrity.h b/security/integrity/integrity.h
> index 79799a0d9195..2d245f44ca26 100644
> --- a/security/integrity/integrity.h
> +++ b/security/integrity/integrity.h
> @@ -150,6 +150,7 @@ int integrity_kernel_read(struct file *file, loff_t 
> offset,
> 
>  #ifdef CONFIG_INTEGRITY_SIGNATURE
> 
> +struct key *integrity_keyring_from_id(const unsigned int id);
>  int integrity_digsig_verify(const unsigned int id, const char *sig, int 
> siglen,
>   const char *digest, int digestlen);
> 
> @@ -157,6 +158,11 @@ int __init integrity_init_keyring(const unsigned int id);
>  int __init integrity_load_x509(const unsigned int id, const char *path);
>  #else
> 
> +static inline struct key *integrity_keyring_from_id(const unsigned int id)
> +{
> + return ERR_PTR(-EINVAL);
> +}
> +
>  static inline int integrity_digsig_verify(const unsigned int id,
> const char *sig, int siglen,
> const char *digest, int digestlen)
> 



Re: [PATCH 1/5 v4] add compression algorithm zBeWalgo

2018-03-21 Thread Philippe Ombredanne
Benjamin,

On Wed, Mar 21, 2018 at 12:32 AM, Benjamin Warnke
<4bwar...@informatik.uni-hamburg.de> wrote:
> Ok, I will use
>
> /* SPDX-License-Identifier: GPL-2.0 */
> /*
>  * Copyright (c) 2018 Benjamin Warnke <4bwar...@informatik.uni-hamburg.de>
> ...
>
> at the top of my files instead of that boilerplate text. And
>
> MODULE_LICENSE("GPL");
>
> at the bottom of the module-files.


>
> I used the file "crypto/lz4.c" - since it is a compression algorithm too - as 
> an example of how to format the licensing text.
> Unfortunately there is the same 'error'.
> I fixed this error in all of my files in all patches.

Actually to be consistent if you want to use GPL-2-0 (and not "or
later") you should use:

1. at the top, for a c. file:
// SPDX-License-Identifier: GPL-2.0

or for a .h file:
/* SPDX-License-Identifier: GPL-2.0 */

The doc explains it all. Including the comment style (a topic that has
been discussed also on list quite bit: Linus had the final word there)

2. and in your MODULE_LICENSE macro:

MODULE_LICENSE("GPL v2");

 because a MODULE_LICENSE("GPL"); would mean GPL-2.0+ (e.g. or any
later version) and this would not match your top level license tag.

I know this may seem confusing, but there is little hope we can change
the MODULE_LICENSE tags that are used by many external module loaders.
Comments in module.h explain it all.

-- 
Cordially
Philippe Ombredanne


Re: [PATCH 1/1] lz4: Implement lz4 with dynamic offset length.

2018-03-21 Thread Nick Terrell
On (03/21/18 10:10), Maninder Singh wrote:
> diff --git a/lib/lz4/lz4_compress.c b/lib/lz4/lz4_compress.c
> index cc7b6d4..185c358 100644
> --- a/lib/lz4/lz4_compress.c
> +++ b/lib/lz4/lz4_compress.c
> @@ -183,7 +183,8 @@ static FORCE_INLINE int LZ4_compress_generic(
>   const tableType_t tableType,
>   const dict_directive dict,
>   const dictIssue_directive dictIssue,
> - const U32 acceleration)
> + const U32 acceleration,
> + const Dynamic_Offset dynOffset)
>  {
>   const BYTE *ip = (const BYTE *) source;
>   const BYTE *base;
> @@ -199,6 +200,7 @@ static FORCE_INLINE int LZ4_compress_generic(
>
>   BYTE *op = (BYTE *) dest;
>   BYTE * const olimit = op + maxOutputSize;
> + int max_distance = dynOffset ? MAX_DISTANCE_DYN : MAX_DISTANCE;

Lets mark this variable `const`. If the compiler doesn't realize that this
variable and `dynOffset` are compile time constants, I expect the speed to
be impacted.

>
>   U32 forwardH;
>   size_t refDelta = 0;
> @@ -245,6 +247,7 @@ static FORCE_INLINE int LZ4_compress_generic(
>   for ( ; ; ) {
>   const BYTE *match;
>   BYTE *token;
> + int curr_offset;
>
>   /* Find a match */
>   {
> @@ -285,7 +288,7 @@ static FORCE_INLINE int LZ4_compress_generic(
>   : 0)
>   || ((tableType == byU16)
>   ? 0
> - : (match + MAX_DISTANCE < ip))
> + : (match + max_distance < ip))
>   || (LZ4_read32(match + refDelta)
>   != LZ4_read32(ip)));
>   }
> @@ -328,8 +331,26 @@ static FORCE_INLINE int LZ4_compress_generic(
>
>  _next_match:
>   /* Encode Offset */
> - LZ4_writeLE16(op, (U16)(ip - match));
> - op += 2;
> + if (dynOffset) {
> + curr_offset = (U16)(ip - match);
> +
> + /*
> +  * If Ofsset is greater than 127, we need 2 bytes
> +  * to store it. Otherwise 1 byte is enough.
> +  */
> + if (curr_offset > 127) {
> + curr_offset = (curr_offset << 1) | DYN_BIT;
> + LZ4_writeLE16(op, (U16)curr_offset);
> + op += 2;
> + } else {
> + curr_offset = curr_offset << 1;
> + *op = (BYTE)curr_offset;
> + op++;
> + }

The standard way to do variable sized integers is to use the high-bit as
the control bit, not the low-bit. Do you have benchmarks to show that using
the low-bit is faster than using the high-bit? If so, lets comment in the
code, if not lets use the high-bit.

> + } else {
> + LZ4_writeLE16(op, (U16)(ip - match));
> + op += 2;
> + }
>
>   /* Encode MatchLength */
>   {
> @@ -480,39 +501,70 @@ static int LZ4_compress_fast_extState(
>   return LZ4_compress_generic(ctx, source,
>   dest, inputSize, 0,
>   noLimit, byU16, noDict,
> - noDictIssue, acceleration);
> + noDictIssue, acceleration, NoDynOffset);
>   else
>   return LZ4_compress_generic(ctx, source,
>   dest, inputSize, 0,
>   noLimit, tableType, noDict,
> - noDictIssue, acceleration);
> + noDictIssue, acceleration, NoDynOffset);
>   } else {
>   if (inputSize < LZ4_64Klimit)
>   return LZ4_compress_generic(ctx, source,
>   dest, inputSize,
>   maxOutputSize, limitedOutput, byU16, noDict,
> - noDictIssue, acceleration);
> + noDictIssue, acceleration, NoDynOffset);
>   else
>   return LZ4_compress_generic(ctx, source,
>   dest, inputSize,
>   maxOutputSize, limitedOutput, tableType, noDict,
> - noDictIssue, acceleration);
> + noDictIssue, acceleration, NoDynOffset);
>   }
>  }
>
> +static int LZ4_compress_fast_extState_dynamic(
> + void *state,
> + const char *source,
> + char *dest,
> + int inputSize,
> + int maxOutputSize,
> + int acceleration)
> +{
> + LZ4_stream_t_internal *ctx = &((LZ4_stream_t 
> *)state)->internal_donotuse;
> +
> + LZ4_resetStream((LZ4_stream_t *)state);
> +
> + if (acceleration < 1)
> +

Re: [PATCH 0/1] cover-letter/lz4: Implement lz4 with dynamic offset length.

2018-03-21 Thread Nick Terrell
On (03/21/18 10:10), Maninder Singh wrote:
> LZ4 specification defines 2 byte offset length for 64 KB data.
> But in case of ZRAM we compress data per page and in most of
> architecture PAGE_SIZE is 4KB. So we can decide offset length based
> on actual offset value. For this we can reserve 1 bit to decide offset
> length (1 byte or 2 byte). 2 byte required only if ofsset is greater than 127,
> else 1 byte is enough.
> 
> With this new implementation new offset value can be at MAX 32 KB.
> 
> Thus we can save more memory for compressed data.
> 
> results checked with new implementation:-
> 
> comression size for same input source
> (LZ4_DYN < LZO < LZ4)
> 
> LZO
> ===
> orig_data_size: 78917632
> compr_data_size: 15894668
> mem_used_total: 17117184
> 
> LZ4
> 
> orig_data_size: 78917632
> compr_data_size: 16310717
> mem_used_total: 17592320
> 
> LZ4_DYN
> ===
> orig_data_size: 78917632
> compr_data_size: 15520506
> mem_used_total: 16748544

This seems like a reasonable extension to the algorithm, and it looks like
LZ4_DYN is about a 5% improvement to compression ratio on your benchmark.
The biggest question I have is if it is worthwhile to maintain a separate
incompatible variant of LZ4 in the kernel without any upstream for a 5%
gain? If we do want to go forward with this, we should perform more
benchmarks.

I commented in the patch, but because the `dynOffset` variable isn't a
compile time static in LZ4_decompress_generic(), I suspect that the patch
causes a regression in decompression speed for both LZ4 and LZ4_DYN. You'll
need to re-run the benchmarks to first show that LZ4 before the patch
performs the same as LZ4 after the patch. Then re-run the LZ4 vs LZ4_DYN
benchmarks.

I would also like to see a benchmark in user-space (with the code), so we
can see the performance of LZ4 before and after the patch, as well as LZ4
vs LZ4_DYN without anything else going on. I expect the extra branches in
the decoding loop to have an impact on speed, and I would like to see how
big the impact is without noise. 

CC-ing Yann Collet, the author of LZ4



Re: [PATCH v13 00/10] Add io{read|write}64 to io-64-atomic headers

2018-03-21 Thread Logan Gunthorpe


On 21/03/18 11:40 AM, Andy Shevchenko wrote:
> The first patches that I didn't reviewed before makes sense to me, the
> rest (at the end) FWIW,
> 
> Reviewed-by: Andy Shevchenko 

Great, thanks!

Logan


Re: [PATCH v13 00/10] Add io{read|write}64 to io-64-atomic headers

2018-03-21 Thread Andy Shevchenko
On Wed, Mar 21, 2018 at 6:37 PM, Logan Gunthorpe  wrote:
>
> This is v13 of my cleanup series to push a number of instances of people
> defining their own io{read|write}64 functions into common headers seing
> they don't exist in non-64bit systems. This series adds inline functions to 
> the
> io-64-nonatomic headers and then cleans up the drivers that defined their
> own copies.
>
> This cleanup was originally requested by Greg after he reviewed my
> Switchtec NTB code. And I hope someone can pick it up or at least give
> feedback on it soon as it's been around relatively unchanged for a few
> cycles now and I'm getting a bit tired of resubmitting it with little to
> no interest.
>


The first patches that I didn't reviewed before makes sense to me, the
rest (at the end) FWIW,

Reviewed-by: Andy Shevchenko 

> Thanks,
>
> Logan
>
> --
>
> Changes since v12:
> - Rebased onto v4.16-rc6
> - Split patch 0001 into two and reworked the commit log as requested
>   by Luc Van Oostenryck
>
> Changes since v11:
> - Rebased onto v4.16-rc5
> - Added a patch (0001) to fix some old and new sparse warnings
>   that the kbuild robot warned about this cycle. The latest version
>   of sparse was required to reproduce these.
> - Added a patch (0002) to add io{read|write}64 to parisc which the kbuild
>   robot also found errors for this cycle
>
> Changes since v10:
> - Rebased onto v4.16-rc4, this droped the drm/tilcdc patch which was
>   picked up by that tree and is already in 4.16.
>
> Changes since v9:
> - Rebased onto v4.15-rc6
> - Fixed a couple of issues in the new version of the CAAM patch as
>   pointed out by Horia
>
> Changes since v8:
> - Rebased onto v4.15-rc2, as a result rewrote patch 7 seeing someone did
>   some similar cleanup in that area.
> - Added a patch to clean up the Switchtec NTB driver which landed in
>   v4.15-rc1
>
> Changes since v7:
> - Fix minor nits from Andy Shevchenko
> - Rebased onto v4.14-rc1
>
> Changes since v6:
>  ** none **
>
> Changes since v5:
> - Added a fix to the tilcdc driver to ensure it doesn't use the
>   non-atomic operation. (This includes adding 
> io{read|write}64[be]_is_nonatomic
>   defines).
>
> Changes since v4:
> - Add functions so the powerpc implementation of iomap.c compiles. (As
>   noticed by Horia)
>
> Changes since v3:
>
> - I noticed powerpc didn't use the appropriate functions seeing
>   readq/writeq were not defined when iomap.h was included. Thus I've
>   included a patch to adjust this
> - Fixed some mistakes with a couple of the defines in io-64-nonatomic*
>   headers
> - Fixed a typo noticed by Horia.
>
> (earlier versions were drastically different)
>
> --
>
> Logan Gunthorpe (10):
>   iomap: Use correct endian conversion function in mmio_writeXXbe
>   iomap: Fix sparse endian check warnings
>   parisc: iomap: introduce io{read|write}64
>   powerpc: io.h: move iomap.h include so that it can use readq/writeq
> defs
>   powerpc: iomap.c: introduce io{read|write}64_{lo_hi|hi_lo}
>   iomap: introduce io{read|write}64_{lo_hi|hi_lo}
>   io-64-nonatomic: add io{read|write}64[be]{_lo_hi|_hi_lo} macros
>   ntb: ntb_hw_intel: use io-64-nonatomic instead of in-driver hacks
>   crypto: caam: cleanup CONFIG_64BIT ifdefs when using io{read|write}64
>   ntb: ntb_hw_switchtec: Cleanup 64bit IO defines to use the common
> header
>
>  arch/parisc/include/asm/io.h   |   9 +++
>  arch/parisc/lib/iomap.c|  64 +++
>  arch/powerpc/include/asm/io.h  |   6 +-
>  arch/powerpc/kernel/iomap.c|  40 ++
>  drivers/crypto/caam/regs.h |  30 +--
>  drivers/ntb/hw/intel/ntb_hw_intel.c|  30 +--
>  drivers/ntb/hw/mscc/ntb_hw_switchtec.c |  36 +
>  include/asm-generic/iomap.h|  26 --
>  include/linux/io-64-nonatomic-hi-lo.h  |  64 +++
>  include/linux/io-64-nonatomic-lo-hi.h  |  64 +++
>  lib/iomap.c| 140 
> -
>  11 files changed, 409 insertions(+), 100 deletions(-)
>
> --
> 2.11.0



-- 
With Best Regards,
Andy Shevchenko


Re: [PATCH v13 01/10] iomap: Use correct endian conversion function in mmio_writeXXbe

2018-03-21 Thread Luc Van Oostenryck
On Wed, Mar 21, 2018 at 10:37:36AM -0600, Logan Gunthorpe wrote:
> The semantics of the iowriteXXbe() functions are to write a
> value in CPU endianess to an IO register that is known by the
> caller to be in Big Endian. The mmio_writeXXbe() macro, which
> is called by iowriteXXbe(), should therefore use cpu_to_beXX()
> instead of beXX_to_cpu().
> 
> Seeing both beXX_to_cpu() and cpu_to_beXX() are both functionally
> implemented as either null operations or swabXX operations there
> was no noticable bug here. But it is confusing for both developers
> and code analysis tools alike.
> 
> Signed-off-by: Logan Gunthorpe 
> Cc: Philippe Ombredanne 
> Cc: Thomas Gleixner 
> Cc: Kate Stewart 
> Cc: Greg Kroah-Hartman 
> Cc: Luc Van Oostenryck 
> ---
>  lib/iomap.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/lib/iomap.c b/lib/iomap.c
> index 541d926da95e..be120c13d6cc 100644
> --- a/lib/iomap.c
> +++ b/lib/iomap.c
> @@ -106,8 +106,8 @@ EXPORT_SYMBOL(ioread32be);
>  #endif
>  
>  #ifndef mmio_write16be
> -#define mmio_write16be(val,port) __raw_writew(be16_to_cpu(val),port)
> -#define mmio_write32be(val,port) __raw_writel(be32_to_cpu(val),port)
> +#define mmio_write16be(val,port) __raw_writew(cpu_to_be16(val),port)
> +#define mmio_write32be(val,port) __raw_writel(cpu_to_be32(val),port)
>  #endif
>  
>  void iowrite8(u8 val, void __iomem *addr)
> -- 
> 2.11.0
> 

LGTM, feel free to add my
Reviewed-by: Luc Van Oostenryck 

-- Luc


[PATCH 6/9] crypto: ixp4xx - don't leak pointers to authenc keys

2018-03-21 Thread Tudor Ambarus
Signed-off-by: Tudor Ambarus 
---
 drivers/crypto/ixp4xx_crypto.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/crypto/ixp4xx_crypto.c b/drivers/crypto/ixp4xx_crypto.c
index 717a266..27f7dad 100644
--- a/drivers/crypto/ixp4xx_crypto.c
+++ b/drivers/crypto/ixp4xx_crypto.c
@@ -1167,9 +1167,11 @@ static int aead_setkey(struct crypto_aead *tfm, const u8 
*key,
ctx->authkey_len = keys.authkeylen;
ctx->enckey_len = keys.enckeylen;
 
+   memzero_explicit(, sizeof(keys));
return aead_setup(tfm, crypto_aead_authsize(tfm));
 badkey:
crypto_aead_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
+   memzero_explicit(, sizeof(keys));
return -EINVAL;
 }
 
-- 
2.9.4



[PATCH 7/9] crypto: picoxcell - don't leak pointers to authenc keys

2018-03-21 Thread Tudor Ambarus
Signed-off-by: Tudor Ambarus 
---
 drivers/crypto/picoxcell_crypto.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/crypto/picoxcell_crypto.c 
b/drivers/crypto/picoxcell_crypto.c
index 4ef52c9..a4df966 100644
--- a/drivers/crypto/picoxcell_crypto.c
+++ b/drivers/crypto/picoxcell_crypto.c
@@ -499,10 +499,12 @@ static int spacc_aead_setkey(struct crypto_aead *tfm, 
const u8 *key,
memcpy(ctx->hash_ctx, keys.authkey, keys.authkeylen);
ctx->hash_key_len = keys.authkeylen;
 
+   memzero_explicit(, sizeof(keys));
return 0;
 
 badkey:
crypto_aead_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
+   memzero_explicit(, sizeof(keys));
return -EINVAL;
 }
 
-- 
2.9.4



[PATCH 2/9] crypto: authencesn - don't leak pointers to authenc keys

2018-03-21 Thread Tudor Ambarus
Signed-off-by: Tudor Ambarus 
---
 crypto/authencesn.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/crypto/authencesn.c b/crypto/authencesn.c
index 15f91dd..3bed6d1 100644
--- a/crypto/authencesn.c
+++ b/crypto/authencesn.c
@@ -69,8 +69,10 @@ static int crypto_authenc_esn_setkey(struct crypto_aead 
*authenc_esn, const u8 *
struct crypto_authenc_keys keys;
int err = -EINVAL;
 
-   if (crypto_authenc_extractkeys(, key, keylen) != 0)
-   goto badkey;
+   if (crypto_authenc_extractkeys(, key, keylen) != 0) {
+   crypto_aead_set_flags(authenc_esn, CRYPTO_TFM_RES_BAD_KEY_LEN);
+   goto out;
+   }
 
crypto_ahash_clear_flags(auth, CRYPTO_TFM_REQ_MASK);
crypto_ahash_set_flags(auth, crypto_aead_get_flags(authenc_esn) &
@@ -90,11 +92,8 @@ static int crypto_authenc_esn_setkey(struct crypto_aead 
*authenc_esn, const u8 *
   CRYPTO_TFM_RES_MASK);
 
 out:
+   memzero_explicit(, sizeof(keys));
return err;
-
-badkey:
-   crypto_aead_set_flags(authenc_esn, CRYPTO_TFM_RES_BAD_KEY_LEN);
-   goto out;
 }
 
 static int crypto_authenc_esn_genicv_tail(struct aead_request *req,
-- 
2.9.4



[PATCH 9/9] crypto: talitos - don't leak pointers to authenc keys

2018-03-21 Thread Tudor Ambarus
Signed-off-by: Tudor Ambarus 
---
 drivers/crypto/talitos.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/crypto/talitos.c b/drivers/crypto/talitos.c
index 447cb8b..c92efc7 100644
--- a/drivers/crypto/talitos.c
+++ b/drivers/crypto/talitos.c
@@ -904,10 +904,12 @@ static int aead_setkey(struct crypto_aead *authenc,
ctx->dma_key = dma_map_single(dev, ctx->key, ctx->keylen,
  DMA_TO_DEVICE);
 
+   memzero_explicit(, sizeof(keys));
return 0;
 
 badkey:
crypto_aead_set_flags(authenc, CRYPTO_TFM_RES_BAD_KEY_LEN);
+   memzero_explicit(, sizeof(keys));
return -EINVAL;
 }
 
-- 
2.9.4



[PATCH 8/9] crypto: qat - don't leak pointers to authenc keys

2018-03-21 Thread Tudor Ambarus
Signed-off-by: Tudor Ambarus 
---
 drivers/crypto/qat/qat_common/qat_algs.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/crypto/qat/qat_common/qat_algs.c 
b/drivers/crypto/qat/qat_common/qat_algs.c
index baffae8..1138e41 100644
--- a/drivers/crypto/qat/qat_common/qat_algs.c
+++ b/drivers/crypto/qat/qat_common/qat_algs.c
@@ -546,11 +546,14 @@ static int qat_alg_aead_init_sessions(struct crypto_aead 
*tfm, const u8 *key,
if (qat_alg_aead_init_dec_session(tfm, alg, , mode))
goto error;
 
+   memzero_explicit(, sizeof(keys));
return 0;
 bad_key:
crypto_aead_set_flags(tfm, CRYPTO_TFM_RES_BAD_KEY_LEN);
+   memzero_explicit(, sizeof(keys));
return -EINVAL;
 error:
+   memzero_explicit(, sizeof(keys));
return -EFAULT;
 }
 
-- 
2.9.4



[PATCH v1] crypto: Deduplicate le32_to_cpu_array() and cpu_to_le32_array()

2018-03-21 Thread Andy Shevchenko
Deduplicate le32_to_cpu_array() and cpu_to_le32_array() by moving them
to the generic header.

No functional change implied.

Signed-off-by: Andy Shevchenko 
---
 crypto/md4.c  | 17 -
 crypto/md5.c  | 17 -
 include/linux/byteorder/generic.h | 17 +
 3 files changed, 17 insertions(+), 34 deletions(-)

diff --git a/crypto/md4.c b/crypto/md4.c
index 3515af425cc9..810fefb0a007 100644
--- a/crypto/md4.c
+++ b/crypto/md4.c
@@ -64,23 +64,6 @@ static inline u32 H(u32 x, u32 y, u32 z)
 #define ROUND2(a,b,c,d,k,s) (a = lshift(a + G(b,c,d) + k + (u32)0x5A827999,s))
 #define ROUND3(a,b,c,d,k,s) (a = lshift(a + H(b,c,d) + k + (u32)0x6ED9EBA1,s))
 
-/* XXX: this stuff can be optimized */
-static inline void le32_to_cpu_array(u32 *buf, unsigned int words)
-{
-   while (words--) {
-   __le32_to_cpus(buf);
-   buf++;
-   }
-}
-
-static inline void cpu_to_le32_array(u32 *buf, unsigned int words)
-{
-   while (words--) {
-   __cpu_to_le32s(buf);
-   buf++;
-   }
-}
-
 static void md4_transform(u32 *hash, u32 const *in)
 {
u32 a, b, c, d;
diff --git a/crypto/md5.c b/crypto/md5.c
index f7ae1a48225b..f776ef43d621 100644
--- a/crypto/md5.c
+++ b/crypto/md5.c
@@ -32,23 +32,6 @@ const u8 md5_zero_message_hash[MD5_DIGEST_SIZE] = {
 };
 EXPORT_SYMBOL_GPL(md5_zero_message_hash);
 
-/* XXX: this stuff can be optimized */
-static inline void le32_to_cpu_array(u32 *buf, unsigned int words)
-{
-   while (words--) {
-   __le32_to_cpus(buf);
-   buf++;
-   }
-}
-
-static inline void cpu_to_le32_array(u32 *buf, unsigned int words)
-{
-   while (words--) {
-   __cpu_to_le32s(buf);
-   buf++;
-   }
-}
-
 #define F1(x, y, z)(z ^ (x & (y ^ z)))
 #define F2(x, y, z)F1(z, x, y)
 #define F3(x, y, z)(x ^ y ^ z)
diff --git a/include/linux/byteorder/generic.h 
b/include/linux/byteorder/generic.h
index 451aaa0786ae..4b13e0a3e15b 100644
--- a/include/linux/byteorder/generic.h
+++ b/include/linux/byteorder/generic.h
@@ -156,6 +156,23 @@ static inline void le64_add_cpu(__le64 *var, u64 val)
*var = cpu_to_le64(le64_to_cpu(*var) + val);
 }
 
+/* XXX: this stuff can be optimized */
+static inline void le32_to_cpu_array(u32 *buf, unsigned int words)
+{
+   while (words--) {
+   __le32_to_cpus(buf);
+   buf++;
+   }
+}
+
+static inline void cpu_to_le32_array(u32 *buf, unsigned int words)
+{
+   while (words--) {
+   __cpu_to_le32s(buf);
+   buf++;
+   }
+}
+
 static inline void be16_add_cpu(__be16 *var, u16 val)
 {
*var = cpu_to_be16(be16_to_cpu(*var) + val);
-- 
2.16.2



[PATCH 0/9] don't leak pointers to authenc keys

2018-03-21 Thread Tudor Ambarus
There are few places in crypto where we save pointers to the
authenc keys to a local variable of type struct crypto_authenc_keys
and we don't zeroize it after use. Fix all those cases and don't
leak pointers to the authenc keys.

Tudor Ambarus (9):
  crypto: authenc - don't leak pointers to authenc keys
  crypto: authencesn - don't leak pointers to authenc keys
  crypto: caam - don't leak pointers to authenc keys
  crypto: caam/qi - don't leak pointers to authenc keys
  crypto: chcr - don't leak pointers to authenc keys
  crypto: ixp4xx - don't leak pointers to authenc keys
  crypto: picoxcell - don't leak pointers to authenc keys
  crypto: qat - don't leak pointers to authenc keys
  crypto: talitos - don't leak pointers to authenc keys

 crypto/authenc.c | 11 +--
 crypto/authencesn.c  | 11 +--
 drivers/crypto/caam/caamalg.c|  2 ++
 drivers/crypto/caam/caamalg_qi.c |  2 ++
 drivers/crypto/chelsio/chcr_algo.c   |  5 +
 drivers/crypto/ixp4xx_crypto.c   |  2 ++
 drivers/crypto/picoxcell_crypto.c|  2 ++
 drivers/crypto/qat/qat_common/qat_algs.c |  3 +++
 drivers/crypto/talitos.c |  2 ++
 9 files changed, 28 insertions(+), 12 deletions(-)

-- 
2.9.4



[PATCH 4/9] crypto: caam/qi - don't leak pointers to authenc keys

2018-03-21 Thread Tudor Ambarus
Signed-off-by: Tudor Ambarus 
---
 drivers/crypto/caam/caamalg_qi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/crypto/caam/caamalg_qi.c b/drivers/crypto/caam/caamalg_qi.c
index c2b5762..cacda08 100644
--- a/drivers/crypto/caam/caamalg_qi.c
+++ b/drivers/crypto/caam/caamalg_qi.c
@@ -278,9 +278,11 @@ static int aead_setkey(struct crypto_aead *aead, const u8 
*key,
}
}
 
+   memzero_explicit(, sizeof(keys));
return ret;
 badkey:
crypto_aead_set_flags(aead, CRYPTO_TFM_RES_BAD_KEY_LEN);
+   memzero_explicit(, sizeof(keys));
return -EINVAL;
 }
 
-- 
2.9.4



[PATCH 5/9] crypto: chcr - don't leak pointers to authenc keys

2018-03-21 Thread Tudor Ambarus
Signed-off-by: Tudor Ambarus 
---
 drivers/crypto/chelsio/chcr_algo.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/drivers/crypto/chelsio/chcr_algo.c 
b/drivers/crypto/chelsio/chcr_algo.c
index 4617c7a..91bc77a 100644
--- a/drivers/crypto/chelsio/chcr_algo.c
+++ b/drivers/crypto/chelsio/chcr_algo.c
@@ -3457,6 +3457,7 @@ static int chcr_authenc_setkey(struct crypto_aead 
*authenc, const u8 *key,
if (IS_ERR(base_hash)) {
pr_err("chcr : Base driver cannot be loaded\n");
aeadctx->enckey_len = 0;
+   memzero_explicit(, sizeof(keys));
return -EINVAL;
}
{
@@ -3508,10 +3509,12 @@ static int chcr_authenc_setkey(struct crypto_aead 
*authenc, const u8 *key,
actx->auth_mode = param.auth_mode;
chcr_free_shash(base_hash);
 
+   memzero_explicit(, sizeof(keys));
return 0;
}
 out:
aeadctx->enckey_len = 0;
+   memzero_explicit(, sizeof(keys));
if (!IS_ERR(base_hash))
chcr_free_shash(base_hash);
return -EINVAL;
@@ -3574,9 +3577,11 @@ static int chcr_aead_digest_null_setkey(struct 
crypto_aead *authenc,
aeadctx->key_ctx_hdr = FILL_KEY_CTX_HDR(ck_size, CHCR_KEYCTX_NO_KEY, 0,
0, key_ctx_len >> 4);
actx->auth_mode = CHCR_SCMD_AUTH_MODE_NOP;
+   memzero_explicit(, sizeof(keys));
return 0;
 out:
aeadctx->enckey_len = 0;
+   memzero_explicit(, sizeof(keys));
return -EINVAL;
 }
 
-- 
2.9.4



[PATCH 3/9] crypto: caam - don't leak pointers to authenc keys

2018-03-21 Thread Tudor Ambarus
Signed-off-by: Tudor Ambarus 
---
 drivers/crypto/caam/caamalg.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/crypto/caam/caamalg.c b/drivers/crypto/caam/caamalg.c
index 584a6c1..7207a53 100644
--- a/drivers/crypto/caam/caamalg.c
+++ b/drivers/crypto/caam/caamalg.c
@@ -570,9 +570,11 @@ static int aead_setkey(struct crypto_aead *aead,
 
 skip_split_key:
ctx->cdata.keylen = keys.enckeylen;
+   memzero_explicit(, sizeof(keys));
return aead_set_sh_desc(aead);
 badkey:
crypto_aead_set_flags(aead, CRYPTO_TFM_RES_BAD_KEY_LEN);
+   memzero_explicit(, sizeof(keys));
return -EINVAL;
 }
 
-- 
2.9.4



[PATCH 1/9] crypto: authenc - don't leak pointers to authenc keys

2018-03-21 Thread Tudor Ambarus
Signed-off-by: Tudor Ambarus 
---
 crypto/authenc.c | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/crypto/authenc.c b/crypto/authenc.c
index d3d6d72..480a08b 100644
--- a/crypto/authenc.c
+++ b/crypto/authenc.c
@@ -87,8 +87,10 @@ static int crypto_authenc_setkey(struct crypto_aead 
*authenc, const u8 *key,
struct crypto_authenc_keys keys;
int err = -EINVAL;
 
-   if (crypto_authenc_extractkeys(, key, keylen) != 0)
-   goto badkey;
+   if (crypto_authenc_extractkeys(, key, keylen) != 0) {
+   crypto_aead_set_flags(authenc, CRYPTO_TFM_RES_BAD_KEY_LEN);
+   goto out;
+   }
 
crypto_ahash_clear_flags(auth, CRYPTO_TFM_REQ_MASK);
crypto_ahash_set_flags(auth, crypto_aead_get_flags(authenc) &
@@ -108,11 +110,8 @@ static int crypto_authenc_setkey(struct crypto_aead 
*authenc, const u8 *key,
   CRYPTO_TFM_RES_MASK);
 
 out:
+   memzero_explicit(, sizeof(keys));
return err;
-
-badkey:
-   crypto_aead_set_flags(authenc, CRYPTO_TFM_RES_BAD_KEY_LEN);
-   goto out;
 }
 
 static void authenc_geniv_ahash_done(struct crypto_async_request *areq, int 
err)
-- 
2.9.4



[PATCH v13 09/10] crypto: caam: cleanup CONFIG_64BIT ifdefs when using io{read|write}64

2018-03-21 Thread Logan Gunthorpe
Clean up the extra ifdefs which defined the wr_reg64 and rd_reg64
functions in non-64bit cases in favour of the new common
io-64-nonatomic-lo-hi header.

To be consistent with CAAM engine HW spec: in case of 64-bit registers,
irrespective of device endianness, the lower address should be read from
/ written to first, followed by the upper address. Indeed the I/O
accessors in CAAM driver currently don't follow the spec, however this
is a good opportunity to fix the code.

Signed-off-by: Logan Gunthorpe 
Reviewed-by: Horia Geantă 
Cc: Andy Shevchenko 
Cc: Dan Douglass 
Cc: Herbert Xu 
Cc: "David S. Miller" 
---
 drivers/crypto/caam/regs.h | 30 +++---
 1 file changed, 3 insertions(+), 27 deletions(-)

diff --git a/drivers/crypto/caam/regs.h b/drivers/crypto/caam/regs.h
index fee363865d88..f887b371040f 100644
--- a/drivers/crypto/caam/regs.h
+++ b/drivers/crypto/caam/regs.h
@@ -10,7 +10,7 @@
 
 #include 
 #include 
-#include 
+#include 
 
 /*
  * Architecture-specific register access methods
@@ -136,10 +136,9 @@ static inline void clrsetbits_32(void __iomem *reg, u32 
clear, u32 set)
  *base + 0x : least-significant 32 bits
  *base + 0x0004 : most-significant 32 bits
  */
-#ifdef CONFIG_64BIT
 static inline void wr_reg64(void __iomem *reg, u64 data)
 {
-   if (caam_little_end)
+   if (!caam_imx && caam_little_end)
iowrite64(data, reg);
else
iowrite64be(data, reg);
@@ -147,35 +146,12 @@ static inline void wr_reg64(void __iomem *reg, u64 data)
 
 static inline u64 rd_reg64(void __iomem *reg)
 {
-   if (caam_little_end)
+   if (!caam_imx && caam_little_end)
return ioread64(reg);
else
return ioread64be(reg);
 }
 
-#else /* CONFIG_64BIT */
-static inline void wr_reg64(void __iomem *reg, u64 data)
-{
-   if (!caam_imx && caam_little_end) {
-   wr_reg32((u32 __iomem *)(reg) + 1, data >> 32);
-   wr_reg32((u32 __iomem *)(reg), data);
-   } else {
-   wr_reg32((u32 __iomem *)(reg), data >> 32);
-   wr_reg32((u32 __iomem *)(reg) + 1, data);
-   }
-}
-
-static inline u64 rd_reg64(void __iomem *reg)
-{
-   if (!caam_imx && caam_little_end)
-   return ((u64)rd_reg32((u32 __iomem *)(reg) + 1) << 32 |
-   (u64)rd_reg32((u32 __iomem *)(reg)));
-
-   return ((u64)rd_reg32((u32 __iomem *)(reg)) << 32 |
-   (u64)rd_reg32((u32 __iomem *)(reg) + 1));
-}
-#endif /* CONFIG_64BIT  */
-
 static inline u64 cpu_to_caam_dma64(dma_addr_t value)
 {
if (caam_imx)
-- 
2.11.0



[PATCH v13 06/10] iomap: introduce io{read|write}64_{lo_hi|hi_lo}

2018-03-21 Thread Logan Gunthorpe
In order to provide non-atomic functions for io{read|write}64 that will
use readq and writeq when appropriate. We define a number of variants
of these functions in the generic iomap that will do non-atomic
operations on pio but atomic operations on mmio.

These functions are only defined if readq and writeq are defined. If
they are not, then the wrappers that always use non-atomic operations
from include/linux/io-64-nonatomic*.h will be used.

Signed-off-by: Logan Gunthorpe 
Reviewed-by: Andy Shevchenko 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Arnd Bergmann 
Cc: Suresh Warrier 
Cc: Nicholas Piggin 
---
 arch/powerpc/include/asm/io.h |   2 +
 include/asm-generic/iomap.h   |  26 +++--
 lib/iomap.c   | 132 ++
 3 files changed, 154 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/include/asm/io.h b/arch/powerpc/include/asm/io.h
index af074923d598..4cc420cfaa78 100644
--- a/arch/powerpc/include/asm/io.h
+++ b/arch/powerpc/include/asm/io.h
@@ -788,8 +788,10 @@ extern void __iounmap_at(void *ea, unsigned long size);
 
 #define mmio_read16be(addr)readw_be(addr)
 #define mmio_read32be(addr)readl_be(addr)
+#define mmio_read64be(addr)readq_be(addr)
 #define mmio_write16be(val, addr)  writew_be(val, addr)
 #define mmio_write32be(val, addr)  writel_be(val, addr)
+#define mmio_write64be(val, addr)  writeq_be(val, addr)
 #define mmio_insb(addr, dst, count)readsb(addr, dst, count)
 #define mmio_insw(addr, dst, count)readsw(addr, dst, count)
 #define mmio_insl(addr, dst, count)readsl(addr, dst, count)
diff --git a/include/asm-generic/iomap.h b/include/asm-generic/iomap.h
index 5b63b94ef6b5..5a4af0199b32 100644
--- a/include/asm-generic/iomap.h
+++ b/include/asm-generic/iomap.h
@@ -31,9 +31,16 @@ extern unsigned int ioread16(void __iomem *);
 extern unsigned int ioread16be(void __iomem *);
 extern unsigned int ioread32(void __iomem *);
 extern unsigned int ioread32be(void __iomem *);
-#ifdef CONFIG_64BIT
-extern u64 ioread64(void __iomem *);
-extern u64 ioread64be(void __iomem *);
+
+#ifdef readq
+#define ioread64_lo_hi ioread64_lo_hi
+#define ioread64_hi_lo ioread64_hi_lo
+#define ioread64be_lo_hi ioread64be_lo_hi
+#define ioread64be_hi_lo ioread64be_hi_lo
+extern u64 ioread64_lo_hi(void __iomem *addr);
+extern u64 ioread64_hi_lo(void __iomem *addr);
+extern u64 ioread64be_lo_hi(void __iomem *addr);
+extern u64 ioread64be_hi_lo(void __iomem *addr);
 #endif
 
 extern void iowrite8(u8, void __iomem *);
@@ -41,9 +48,16 @@ extern void iowrite16(u16, void __iomem *);
 extern void iowrite16be(u16, void __iomem *);
 extern void iowrite32(u32, void __iomem *);
 extern void iowrite32be(u32, void __iomem *);
-#ifdef CONFIG_64BIT
-extern void iowrite64(u64, void __iomem *);
-extern void iowrite64be(u64, void __iomem *);
+
+#ifdef writeq
+#define iowrite64_lo_hi iowrite64_lo_hi
+#define iowrite64_hi_lo iowrite64_hi_lo
+#define iowrite64be_lo_hi iowrite64be_lo_hi
+#define iowrite64be_hi_lo iowrite64be_hi_lo
+extern void iowrite64_lo_hi(u64 val, void __iomem *addr);
+extern void iowrite64_hi_lo(u64 val, void __iomem *addr);
+extern void iowrite64be_lo_hi(u64 val, void __iomem *addr);
+extern void iowrite64be_hi_lo(u64 val, void __iomem *addr);
 #endif
 
 /*
diff --git a/lib/iomap.c b/lib/iomap.c
index 44645c4b516c..f3b7b2e4c625 100644
--- a/lib/iomap.c
+++ b/lib/iomap.c
@@ -67,6 +67,7 @@ static void bad_io_access(unsigned long port, const char 
*access)
 #ifndef mmio_read16be
 #define mmio_read16be(addr) be16_to_cpu((__be16 __force)__raw_readw(addr))
 #define mmio_read32be(addr) be32_to_cpu((__be32 __force)__raw_readl(addr))
+#define mmio_read64be(addr) be64_to_cpu((__be64 __force)__raw_readq(addr))
 #endif
 
 unsigned int ioread8(void __iomem *addr)
@@ -100,6 +101,80 @@ EXPORT_SYMBOL(ioread16be);
 EXPORT_SYMBOL(ioread32);
 EXPORT_SYMBOL(ioread32be);
 
+#ifdef readq
+static u64 pio_read64_lo_hi(unsigned long port)
+{
+   u64 lo, hi;
+
+   lo = inl(port);
+   hi = inl(port + sizeof(u32));
+
+   return lo | (hi << 32);
+}
+
+static u64 pio_read64_hi_lo(unsigned long port)
+{
+   u64 lo, hi;
+
+   hi = inl(port + sizeof(u32));
+   lo = inl(port);
+
+   return lo | (hi << 32);
+}
+
+static u64 pio_read64be_lo_hi(unsigned long port)
+{
+   u64 lo, hi;
+
+   lo = pio_read32be(port + sizeof(u32));
+   hi = pio_read32be(port);
+
+   return lo | (hi << 32);
+}
+
+static u64 pio_read64be_hi_lo(unsigned long port)
+{
+   u64 lo, hi;
+
+   hi = pio_read32be(port);
+   lo = pio_read32be(port + sizeof(u32));
+
+   return lo | (hi << 32);
+}
+
+u64 ioread64_lo_hi(void __iomem *addr)
+{
+   IO_COND(addr, return pio_read64_lo_hi(port), return 

[PATCH v13 08/10] ntb: ntb_hw_intel: use io-64-nonatomic instead of in-driver hacks

2018-03-21 Thread Logan Gunthorpe
Now that ioread64 and iowrite64 are available in io-64-nonatomic,
we can remove the hack at the top of ntb_hw_intel.c and replace it
with an include.

Signed-off-by: Logan Gunthorpe 
Reviewed-by: Andy Shevchenko 
Acked-by: Dave Jiang 
Acked-by: Allen Hubbe 
Acked-by: Jon Mason 
---
 drivers/ntb/hw/intel/ntb_hw_intel.c | 30 +-
 1 file changed, 1 insertion(+), 29 deletions(-)

diff --git a/drivers/ntb/hw/intel/ntb_hw_intel.c 
b/drivers/ntb/hw/intel/ntb_hw_intel.c
index 156b45cd4a19..5cf40ab21366 100644
--- a/drivers/ntb/hw/intel/ntb_hw_intel.c
+++ b/drivers/ntb/hw/intel/ntb_hw_intel.c
@@ -59,6 +59,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "ntb_hw_intel.h"
 
@@ -149,35 +150,6 @@ MODULE_PARM_DESC(xeon_b2b_dsd_bar5_addr32,
 static inline enum ntb_topo xeon_ppd_topo(struct intel_ntb_dev *ndev, u8 ppd);
 static int xeon_init_isr(struct intel_ntb_dev *ndev);
 
-#ifndef ioread64
-#ifdef readq
-#define ioread64 readq
-#else
-#define ioread64 _ioread64
-static inline u64 _ioread64(void __iomem *mmio)
-{
-   u64 low, high;
-
-   low = ioread32(mmio);
-   high = ioread32(mmio + sizeof(u32));
-   return low | (high << 32);
-}
-#endif
-#endif
-
-#ifndef iowrite64
-#ifdef writeq
-#define iowrite64 writeq
-#else
-#define iowrite64 _iowrite64
-static inline void _iowrite64(u64 val, void __iomem *mmio)
-{
-   iowrite32(val, mmio);
-   iowrite32(val >> 32, mmio + sizeof(u32));
-}
-#endif
-#endif
-
 static inline int pdev_is_xeon(struct pci_dev *pdev)
 {
switch (pdev->device) {
-- 
2.11.0



[PATCH v13 02/10] iomap: Fix sparse endian check warnings

2018-03-21 Thread Logan Gunthorpe
Newer version of sparse produce a few warnings of the form:

lib/iomap.c:84:9: warning: cast to restricted __be16

when run with:

make C=2 CF=-D__CHECK_ENDIAN__

(The kbuild robot has recently started running such checks)

The warning is not valid because the __raw_readX() and __raw_writeX()
functions have an endianess determined by the semantics of whichever
register they are operating on (which is intern dependant on the address
passed to the function).

In the iomap case, the register being operated on is known,
semantically, to be Big Endian when an ioXXbe() function is used
by the caller. These functions wrap a raw read or write function
in an endianness conversion function.

Sparse enforces that all values that aren't in CPU endianness
be marked with types like __be16 and similar and said types cannot
be mixed with other types. However, per above, the raw functions
cannot be marked as such seeing the endianness is indeterminate.
Thus, the output of __raw_readX() and the input of __raw_writeX()
must be cast with the __force keyword to supress the invalid warning.

Signed-off-by: Logan Gunthorpe 
Cc: Thomas Gleixner 
Cc: Greg Kroah-Hartman 
Cc: Philippe Ombredanne 
Cc: Kate Stewart 
Cc: Luc Van Oostenryck 
---
 lib/iomap.c | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/lib/iomap.c b/lib/iomap.c
index be120c13d6cc..44645c4b516c 100644
--- a/lib/iomap.c
+++ b/lib/iomap.c
@@ -65,8 +65,8 @@ static void bad_io_access(unsigned long port, const char 
*access)
 #endif
 
 #ifndef mmio_read16be
-#define mmio_read16be(addr) be16_to_cpu(__raw_readw(addr))
-#define mmio_read32be(addr) be32_to_cpu(__raw_readl(addr))
+#define mmio_read16be(addr) be16_to_cpu((__be16 __force)__raw_readw(addr))
+#define mmio_read32be(addr) be32_to_cpu((__be32 __force)__raw_readl(addr))
 #endif
 
 unsigned int ioread8(void __iomem *addr)
@@ -106,8 +106,8 @@ EXPORT_SYMBOL(ioread32be);
 #endif
 
 #ifndef mmio_write16be
-#define mmio_write16be(val,port) __raw_writew(cpu_to_be16(val),port)
-#define mmio_write32be(val,port) __raw_writel(cpu_to_be32(val),port)
+#define mmio_write16be(val,port) __raw_writew((u16 
__force)cpu_to_be16(val),port)
+#define mmio_write32be(val,port) __raw_writel((u32 
__force)cpu_to_be32(val),port)
 #endif
 
 void iowrite8(u8 val, void __iomem *addr)
-- 
2.11.0



[PATCH v13 07/10] io-64-nonatomic: add io{read|write}64[be]{_lo_hi|_hi_lo} macros

2018-03-21 Thread Logan Gunthorpe
This patch adds generic io{read|write}64[be]{_lo_hi|_hi_lo} macros if
they are not already defined by the architecture. (As they are provided
by the generic iomap library).

The patch also points io{read|write}64[be] to the variant specified by the
header name.

This is because new drivers are encouraged to use ioreadXX, et al instead
of readX[1], et al -- and mixing ioreadXX with readq is pretty ugly.

[1] LDD3: section 9.4.2

Signed-off-by: Logan Gunthorpe 
Reviewed-by: Andy Shevchenko 
Cc: Christoph Hellwig 
Cc: Arnd Bergmann 
Cc: Alan Cox 
Cc: Greg Kroah-Hartman 
---
 include/linux/io-64-nonatomic-hi-lo.h | 64 +++
 include/linux/io-64-nonatomic-lo-hi.h | 64 +++
 2 files changed, 128 insertions(+)

diff --git a/include/linux/io-64-nonatomic-hi-lo.h 
b/include/linux/io-64-nonatomic-hi-lo.h
index 862d786a904f..ae21b72cce85 100644
--- a/include/linux/io-64-nonatomic-hi-lo.h
+++ b/include/linux/io-64-nonatomic-hi-lo.h
@@ -55,4 +55,68 @@ static inline void hi_lo_writeq_relaxed(__u64 val, volatile 
void __iomem *addr)
 #define writeq_relaxed hi_lo_writeq_relaxed
 #endif
 
+#ifndef ioread64_hi_lo
+#define ioread64_hi_lo ioread64_hi_lo
+static inline u64 ioread64_hi_lo(void __iomem *addr)
+{
+   u32 low, high;
+
+   high = ioread32(addr + sizeof(u32));
+   low = ioread32(addr);
+
+   return low + ((u64)high << 32);
+}
+#endif
+
+#ifndef iowrite64_hi_lo
+#define iowrite64_hi_lo iowrite64_hi_lo
+static inline void iowrite64_hi_lo(u64 val, void __iomem *addr)
+{
+   iowrite32(val >> 32, addr + sizeof(u32));
+   iowrite32(val, addr);
+}
+#endif
+
+#ifndef ioread64be_hi_lo
+#define ioread64be_hi_lo ioread64be_hi_lo
+static inline u64 ioread64be_hi_lo(void __iomem *addr)
+{
+   u32 low, high;
+
+   high = ioread32be(addr);
+   low = ioread32be(addr + sizeof(u32));
+
+   return low + ((u64)high << 32);
+}
+#endif
+
+#ifndef iowrite64be_hi_lo
+#define iowrite64be_hi_lo iowrite64be_hi_lo
+static inline void iowrite64be_hi_lo(u64 val, void __iomem *addr)
+{
+   iowrite32be(val >> 32, addr);
+   iowrite32be(val, addr + sizeof(u32));
+}
+#endif
+
+#ifndef ioread64
+#define ioread64_is_nonatomic
+#define ioread64 ioread64_hi_lo
+#endif
+
+#ifndef iowrite64
+#define iowrite64_is_nonatomic
+#define iowrite64 iowrite64_hi_lo
+#endif
+
+#ifndef ioread64be
+#define ioread64be_is_nonatomic
+#define ioread64be ioread64be_hi_lo
+#endif
+
+#ifndef iowrite64be
+#define iowrite64be_is_nonatomic
+#define iowrite64be iowrite64be_hi_lo
+#endif
+
 #endif /* _LINUX_IO_64_NONATOMIC_HI_LO_H_ */
diff --git a/include/linux/io-64-nonatomic-lo-hi.h 
b/include/linux/io-64-nonatomic-lo-hi.h
index d042e7bb5adb..faaa842dbdb9 100644
--- a/include/linux/io-64-nonatomic-lo-hi.h
+++ b/include/linux/io-64-nonatomic-lo-hi.h
@@ -55,4 +55,68 @@ static inline void lo_hi_writeq_relaxed(__u64 val, volatile 
void __iomem *addr)
 #define writeq_relaxed lo_hi_writeq_relaxed
 #endif
 
+#ifndef ioread64_lo_hi
+#define ioread64_lo_hi ioread64_lo_hi
+static inline u64 ioread64_lo_hi(void __iomem *addr)
+{
+   u32 low, high;
+
+   low = ioread32(addr);
+   high = ioread32(addr + sizeof(u32));
+
+   return low + ((u64)high << 32);
+}
+#endif
+
+#ifndef iowrite64_lo_hi
+#define iowrite64_lo_hi iowrite64_lo_hi
+static inline void iowrite64_lo_hi(u64 val, void __iomem *addr)
+{
+   iowrite32(val, addr);
+   iowrite32(val >> 32, addr + sizeof(u32));
+}
+#endif
+
+#ifndef ioread64be_lo_hi
+#define ioread64be_lo_hi ioread64be_lo_hi
+static inline u64 ioread64be_lo_hi(void __iomem *addr)
+{
+   u32 low, high;
+
+   low = ioread32be(addr + sizeof(u32));
+   high = ioread32be(addr);
+
+   return low + ((u64)high << 32);
+}
+#endif
+
+#ifndef iowrite64be_lo_hi
+#define iowrite64be_lo_hi iowrite64be_lo_hi
+static inline void iowrite64be_lo_hi(u64 val, void __iomem *addr)
+{
+   iowrite32be(val, addr + sizeof(u32));
+   iowrite32be(val >> 32, addr);
+}
+#endif
+
+#ifndef ioread64
+#define ioread64_is_nonatomic
+#define ioread64 ioread64_lo_hi
+#endif
+
+#ifndef iowrite64
+#define iowrite64_is_nonatomic
+#define iowrite64 iowrite64_lo_hi
+#endif
+
+#ifndef ioread64be
+#define ioread64be_is_nonatomic
+#define ioread64be ioread64be_lo_hi
+#endif
+
+#ifndef iowrite64be
+#define iowrite64be_is_nonatomic
+#define iowrite64be iowrite64be_lo_hi
+#endif
+
 #endif /* _LINUX_IO_64_NONATOMIC_LO_HI_H_ */
-- 
2.11.0



[PATCH v13 10/10] ntb: ntb_hw_switchtec: Cleanup 64bit IO defines to use the common header

2018-03-21 Thread Logan Gunthorpe
Clean up the ifdefs which conditionally defined the io{read|write}64
functions in favour of the new common io-64-nonatomic-lo-hi header.

Per a nit from Andy Shevchenko, the include list is also made
alphabetical.

Signed-off-by: Logan Gunthorpe 
Cc: Jon Mason 
---
 drivers/ntb/hw/mscc/ntb_hw_switchtec.c | 36 --
 1 file changed, 4 insertions(+), 32 deletions(-)

diff --git a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c 
b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
index f624ae27eabe..f403da24b833 100644
--- a/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
+++ b/drivers/ntb/hw/mscc/ntb_hw_switchtec.c
@@ -13,12 +13,13 @@
  *
  */
 
-#include 
-#include 
+#include 
+#include 
 #include 
 #include 
-#include 
+#include 
 #include 
+#include 
 
 MODULE_DESCRIPTION("Microsemi Switchtec(tm) NTB Driver");
 MODULE_VERSION("0.1");
@@ -35,35 +36,6 @@ module_param(use_lut_mws, bool, 0644);
 MODULE_PARM_DESC(use_lut_mws,
 "Enable the use of the LUT based memory windows");
 
-#ifndef ioread64
-#ifdef readq
-#define ioread64 readq
-#else
-#define ioread64 _ioread64
-static inline u64 _ioread64(void __iomem *mmio)
-{
-   u64 low, high;
-
-   low = ioread32(mmio);
-   high = ioread32(mmio + sizeof(u32));
-   return low | (high << 32);
-}
-#endif
-#endif
-
-#ifndef iowrite64
-#ifdef writeq
-#define iowrite64 writeq
-#else
-#define iowrite64 _iowrite64
-static inline void _iowrite64(u64 val, void __iomem *mmio)
-{
-   iowrite32(val, mmio);
-   iowrite32(val >> 32, mmio + sizeof(u32));
-}
-#endif
-#endif
-
 #define SWITCHTEC_NTB_MAGIC 0x45CC0001
 #define MAX_MWS 128
 
-- 
2.11.0



[PATCH v13 05/10] powerpc: iomap.c: introduce io{read|write}64_{lo_hi|hi_lo}

2018-03-21 Thread Logan Gunthorpe
These functions will be introduced into the generic iomap.c so
they can deal with PIO accesses in hi-lo/lo-hi variants. Thus,
the powerpc version of iomap.c will need to provide the same
functions even though, in this arch, they are identical to the
regular io{read|write}64 functions.

Signed-off-by: Logan Gunthorpe 
Tested-by: Horia Geantă 
Reviewed-by: Andy Shevchenko 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
---
 arch/powerpc/kernel/iomap.c | 40 
 1 file changed, 40 insertions(+)

diff --git a/arch/powerpc/kernel/iomap.c b/arch/powerpc/kernel/iomap.c
index aab456ed2a00..5ac84efc6ede 100644
--- a/arch/powerpc/kernel/iomap.c
+++ b/arch/powerpc/kernel/iomap.c
@@ -45,12 +45,32 @@ u64 ioread64(void __iomem *addr)
 {
return readq(addr);
 }
+u64 ioread64_lo_hi(void __iomem *addr)
+{
+   return readq(addr);
+}
+u64 ioread64_hi_lo(void __iomem *addr)
+{
+   return readq(addr);
+}
 u64 ioread64be(void __iomem *addr)
 {
return readq_be(addr);
 }
+u64 ioread64be_lo_hi(void __iomem *addr)
+{
+   return readq_be(addr);
+}
+u64 ioread64be_hi_lo(void __iomem *addr)
+{
+   return readq_be(addr);
+}
 EXPORT_SYMBOL(ioread64);
+EXPORT_SYMBOL(ioread64_lo_hi);
+EXPORT_SYMBOL(ioread64_hi_lo);
 EXPORT_SYMBOL(ioread64be);
+EXPORT_SYMBOL(ioread64be_lo_hi);
+EXPORT_SYMBOL(ioread64be_hi_lo);
 #endif /* __powerpc64__ */
 
 void iowrite8(u8 val, void __iomem *addr)
@@ -83,12 +103,32 @@ void iowrite64(u64 val, void __iomem *addr)
 {
writeq(val, addr);
 }
+void iowrite64_lo_hi(u64 val, void __iomem *addr)
+{
+   writeq(val, addr);
+}
+void iowrite64_hi_lo(u64 val, void __iomem *addr)
+{
+   writeq(val, addr);
+}
 void iowrite64be(u64 val, void __iomem *addr)
 {
writeq_be(val, addr);
 }
+void iowrite64be_lo_hi(u64 val, void __iomem *addr)
+{
+   writeq_be(val, addr);
+}
+void iowrite64be_hi_lo(u64 val, void __iomem *addr)
+{
+   writeq_be(val, addr);
+}
 EXPORT_SYMBOL(iowrite64);
+EXPORT_SYMBOL(iowrite64_lo_hi);
+EXPORT_SYMBOL(iowrite64_hi_lo);
 EXPORT_SYMBOL(iowrite64be);
+EXPORT_SYMBOL(iowrite64be_lo_hi);
+EXPORT_SYMBOL(iowrite64be_hi_lo);
 #endif /* __powerpc64__ */
 
 /*
-- 
2.11.0



[PATCH v13 01/10] iomap: Use correct endian conversion function in mmio_writeXXbe

2018-03-21 Thread Logan Gunthorpe
The semantics of the iowriteXXbe() functions are to write a
value in CPU endianess to an IO register that is known by the
caller to be in Big Endian. The mmio_writeXXbe() macro, which
is called by iowriteXXbe(), should therefore use cpu_to_beXX()
instead of beXX_to_cpu().

Seeing both beXX_to_cpu() and cpu_to_beXX() are both functionally
implemented as either null operations or swabXX operations there
was no noticable bug here. But it is confusing for both developers
and code analysis tools alike.

Signed-off-by: Logan Gunthorpe 
Cc: Philippe Ombredanne 
Cc: Thomas Gleixner 
Cc: Kate Stewart 
Cc: Greg Kroah-Hartman 
Cc: Luc Van Oostenryck 
---
 lib/iomap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/iomap.c b/lib/iomap.c
index 541d926da95e..be120c13d6cc 100644
--- a/lib/iomap.c
+++ b/lib/iomap.c
@@ -106,8 +106,8 @@ EXPORT_SYMBOL(ioread32be);
 #endif
 
 #ifndef mmio_write16be
-#define mmio_write16be(val,port) __raw_writew(be16_to_cpu(val),port)
-#define mmio_write32be(val,port) __raw_writel(be32_to_cpu(val),port)
+#define mmio_write16be(val,port) __raw_writew(cpu_to_be16(val),port)
+#define mmio_write32be(val,port) __raw_writel(cpu_to_be32(val),port)
 #endif
 
 void iowrite8(u8 val, void __iomem *addr)
-- 
2.11.0



[PATCH v13 00/10] Add io{read|write}64 to io-64-atomic headers

2018-03-21 Thread Logan Gunthorpe

This is v13 of my cleanup series to push a number of instances of people
defining their own io{read|write}64 functions into common headers seing
they don't exist in non-64bit systems. This series adds inline functions to the
io-64-nonatomic headers and then cleans up the drivers that defined their
own copies.

This cleanup was originally requested by Greg after he reviewed my
Switchtec NTB code. And I hope someone can pick it up or at least give
feedback on it soon as it's been around relatively unchanged for a few
cycles now and I'm getting a bit tired of resubmitting it with little to
no interest.

Thanks,

Logan

--

Changes since v12:
- Rebased onto v4.16-rc6
- Split patch 0001 into two and reworked the commit log as requested
  by Luc Van Oostenryck

Changes since v11:
- Rebased onto v4.16-rc5
- Added a patch (0001) to fix some old and new sparse warnings
  that the kbuild robot warned about this cycle. The latest version
  of sparse was required to reproduce these.
- Added a patch (0002) to add io{read|write}64 to parisc which the kbuild
  robot also found errors for this cycle

Changes since v10:
- Rebased onto v4.16-rc4, this droped the drm/tilcdc patch which was
  picked up by that tree and is already in 4.16.

Changes since v9:
- Rebased onto v4.15-rc6
- Fixed a couple of issues in the new version of the CAAM patch as
  pointed out by Horia

Changes since v8:
- Rebased onto v4.15-rc2, as a result rewrote patch 7 seeing someone did
  some similar cleanup in that area.
- Added a patch to clean up the Switchtec NTB driver which landed in
  v4.15-rc1

Changes since v7:
- Fix minor nits from Andy Shevchenko
- Rebased onto v4.14-rc1

Changes since v6:
 ** none **

Changes since v5:
- Added a fix to the tilcdc driver to ensure it doesn't use the
  non-atomic operation. (This includes adding io{read|write}64[be]_is_nonatomic
  defines).

Changes since v4:
- Add functions so the powerpc implementation of iomap.c compiles. (As
  noticed by Horia)

Changes since v3:

- I noticed powerpc didn't use the appropriate functions seeing
  readq/writeq were not defined when iomap.h was included. Thus I've
  included a patch to adjust this
- Fixed some mistakes with a couple of the defines in io-64-nonatomic*
  headers
- Fixed a typo noticed by Horia.

(earlier versions were drastically different)

--

Logan Gunthorpe (10):
  iomap: Use correct endian conversion function in mmio_writeXXbe
  iomap: Fix sparse endian check warnings
  parisc: iomap: introduce io{read|write}64
  powerpc: io.h: move iomap.h include so that it can use readq/writeq
defs
  powerpc: iomap.c: introduce io{read|write}64_{lo_hi|hi_lo}
  iomap: introduce io{read|write}64_{lo_hi|hi_lo}
  io-64-nonatomic: add io{read|write}64[be]{_lo_hi|_hi_lo} macros
  ntb: ntb_hw_intel: use io-64-nonatomic instead of in-driver hacks
  crypto: caam: cleanup CONFIG_64BIT ifdefs when using io{read|write}64
  ntb: ntb_hw_switchtec: Cleanup 64bit IO defines to use the common
header

 arch/parisc/include/asm/io.h   |   9 +++
 arch/parisc/lib/iomap.c|  64 +++
 arch/powerpc/include/asm/io.h  |   6 +-
 arch/powerpc/kernel/iomap.c|  40 ++
 drivers/crypto/caam/regs.h |  30 +--
 drivers/ntb/hw/intel/ntb_hw_intel.c|  30 +--
 drivers/ntb/hw/mscc/ntb_hw_switchtec.c |  36 +
 include/asm-generic/iomap.h|  26 --
 include/linux/io-64-nonatomic-hi-lo.h  |  64 +++
 include/linux/io-64-nonatomic-lo-hi.h  |  64 +++
 lib/iomap.c| 140 -
 11 files changed, 409 insertions(+), 100 deletions(-)

--
2.11.0


[RFC] SHASH_DESC_ON_STACK macro

2018-03-21 Thread Gustavo A. R. Silva

Hi Herbert,

There is an ongoing effort to remove all VLAs from the code base [1] and 
while working on that I came across the following macro at 
include/crypto/hash.h:154:


#define SHASH_DESC_ON_STACK(shash, ctx)   \
char __##shash##_desc[sizeof(struct shash_desc) + \
crypto_shash_descsize(ctx)] CRYPTO_MINALIGN_ATTR; \
struct shash_desc *shash = (struct shash_desc *)__##shash##_desc


Currently, this macro is being used in 46 different places.

I wonder how big can tfm->descsize can get?

Do you think it is feasible to replace the call to crypto_shash_descsize 
with a constant value and get rid of 46 VLA warnings?


I have sent some patches to replace the use of this macro with dynamic 
memory allocation instead, but it seems that this is not a suitable 
solution for all cases due to performance issues.


[1] https://lkml.org/lkml/2018/3/7/621

I'd really appreciate any feedback.
Thanks
--
Gustavo


Re: [PATCH] crypto: atmel-aes - fix the keys zeroing on errors

2018-03-21 Thread Antoine Tenart
Hi Tudor,

On Wed, Mar 21, 2018 at 03:15:27PM +0200, Tudor Ambarus wrote:
> On 02/23/2018 02:37 PM, Antoine Tenart wrote:
> > On Fri, Feb 23, 2018 at 02:04:33PM +0200, Tudor Ambarus wrote:
> > > 
> > > There are few other places in crypto where we extract the authenc keys
> > > in the same type of local variable, struct crypto_authenc_keys keys, and
> > > we don't zeroize it after use. I think we should fix those cases too.
> > 
> > You're right, I spotted other places where keys weren't zeroed. I
> > haven't got the time to do anything about it so far :)
> > 
> 
> Will you send a patch to fix those cases? We will forget about it.
> I can do it myself, but it's better to ask first.

The sooner those issues are fixed the better, and I haven't done
anything yet. So yes, if you can, please take care of this :)

Thanks,
Antoine

-- 
Antoine Ténart, Bootlin (formerly Free Electrons)
Embedded Linux and Kernel engineering
https://bootlin.com


Re: [PATCH] crypto: atmel-aes - fix the keys zeroing on errors

2018-03-21 Thread Tudor Ambarus

Hi, Antoine,

On 02/23/2018 02:37 PM, Antoine Tenart wrote:

Hi Tudor,

On Fri, Feb 23, 2018 at 02:04:33PM +0200, Tudor Ambarus wrote:


There are few other places in crypto where we extract the authenc keys
in the same type of local variable, struct crypto_authenc_keys keys, and
we don't zeroize it after use. I think we should fix those cases too.


You're right, I spotted other places where keys weren't zeroed. I
haven't got the time to do anything about it so far :)



Will you send a patch to fix those cases? We will forget about it.
I can do it myself, but it's better to ask first.

best,
ta


Re: [PATCH 0/1] cover-letter/lz4: Implement lz4 with dynamic offset length.

2018-03-21 Thread Sergey Senozhatsky
CC-ing Nick

Nick, can you take a look?

Message-IDs:
lkml.kernel.org/r/1521607242-3968-1-git-send-email-maninder...@samsung.com
lkml.kernel.org/r/1521607242-3968-2-git-send-email-maninder...@samsung.com

-ss


Re: [PATCH 1/1] lz4: Implement lz4 with dynamic offset length.

2018-03-21 Thread Sergey Senozhatsky
On (03/21/18 10:10), Maninder Singh wrote:
[..]
> +static struct crypto_alg alg_lz4_dyn = {
> + .cra_name   = "lz4_dyn",
> + .cra_flags  = CRYPTO_ALG_TYPE_COMPRESS,
> + .cra_ctxsize= sizeof(struct lz4_ctx),
> + .cra_module = THIS_MODULE,
> + .cra_list   = LIST_HEAD_INIT(alg_lz4_dyn.cra_list),
> + .cra_init   = lz4_init,
> + .cra_exit   = lz4_exit,
> + .cra_u  = { .compress = {
> + .coa_compress   = lz4_compress_crypto_dynamic,
> + .coa_decompress = lz4_decompress_crypto_dynamic } }
> +};

[..]

> diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
> index 4ed0a78..5bc5aab 100644
> --- a/drivers/block/zram/zcomp.c
> +++ b/drivers/block/zram/zcomp.c
> @@ -17,11 +17,15 @@
>  #include 
>  
>  #include "zcomp.h"
> +#define KB   (1 << 10)
>  
>  static const char * const backends[] = {
>   "lzo",
>  #if IS_ENABLED(CONFIG_CRYPTO_LZ4)
>   "lz4",
> +#if (PAGE_SIZE < (32 * KB))
> + "lz4_dyn",
> +#endif

This is not the list of supported algorithms. It's the list of
recommended algorithms. You can configure zram to use any of
available and known to Crypto API algorithms. Including lz4_dyn
on PAGE_SIZE > 32K systems.

-ss


Re: [PATCH 1/5 v4] add compression algorithm zBeWalgo

2018-03-21 Thread Benjamin Warnke
Hi Philippe,


> Am 20.03.2018 um 17:30 schrieb Philippe Ombredanne :
> 
> Hi Benjamin,
> 
> On Tue, Mar 20, 2018 at 7:04 AM, Benjamin Warnke
> <4bwar...@informatik.uni-hamburg.de> wrote:
>> zBeWalgo is a completely new algorithm - Currently it is not published
>> somewhere else right now, googleing it would not show up any results. The
>> following section describes how the algorithm works.
> 
> 
> 
>> diff --git a/lib/zbewalgo/zbewalgo.c b/lib/zbewalgo/zbewalgo.c
>> new file mode 100644
>> index 0..ef922bc27
>> --- /dev/null
>> +++ b/lib/zbewalgo/zbewalgo.c
>> @@ -0,0 +1,723 @@
>> +/*
>> + * Copyright (c) 2018 Benjamin Warnke <4bwar...@informatik.uni-hamburg.de>
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms of the GNU General Public License version 2 as published 
>> by
>> + * the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful, but 
>> WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along 
>> with
>> + * this program.
>> + *
> 
> Would you mind using SPDX ids [1] instead of this fine boilerplate
> here and throughout your patches?

Ok, I will use 

/* SPDX-License-Identifier: GPL-2.0 */
/*
 * Copyright (c) 2018 Benjamin Warnke <4bwar...@informatik.uni-hamburg.de>
...

at the top of my files instead of that boilerplate text. And

MODULE_LICENSE("GPL");

at the bottom of the module-files.


> 
> 
>> +MODULE_LICENSE("GPL");
>> +MODULE_DESCRIPTION("zBeWalgo Compression Algorithm");
> 
> Here your MODULE_LICENSE does not match your top level license. See
> module.h [2] for a description of values: GPL would mean "GNU Public
> License v2 or later" whereas your top level license (best expressed
> with SPDX) would mean GPL-2.0 and no other version. To avoid
> confusion, you would need to state the same thing in the
> MODULE_LICENSE and your SPDX tags.

I used the file "crypto/lz4.c" - since it is a compression algorithm too - as 
an example of how to format the licensing text.
Unfortunately there is the same 'error'.
I fixed this error in all of my files in all patches.

Cordially
Benjamin Warnke


Re: [PATCH 0/1] cover-letter/lz4: Implement lz4 with dynamic offset length.

2018-03-21 Thread Sergey Senozhatsky
On (03/21/18 10:10), Maninder Singh wrote:
> (Added cover letter to avoid much text in patch description)
> 
> LZ4 specification defines 2 byte offset length for 64 KB data.
> But in case of ZRAM we compress data per page and in most of
> architecture PAGE_SIZE is 4KB. So we can decide offset length based
> on actual offset value. For this we can reserve 1 bit to decide offset
> length (1 byte or 2 byte). 2 byte required only if ofsset is greater than 127,
> else 1 byte is enough.

So what happens if I compress the data on a system with no dyn
offset and then send it over the network to a machine which has
dyn offset? Or, say, I have a USB stick with a compression enabled
FS, store files on a dyn offset enabled PC and then mount that USB
stick on a machine with no dyn offset support. And vice versa.

-ss