Re: AES-NI: slower than aes-generic?

2016-05-26 Thread Theodore Ts'o
On Thu, May 26, 2016 at 08:49:39PM +0200, Stephan Mueller wrote:
> 
> Using the kernel crypto API one can relieve the CPU of the crypto work, if a 
> hardware or assembler implementation is available. This may be of particular 
> interest for smaller systems. So, for smaller systems (where kernel bloat is 
> bad, but where now these days more and more hardware crypto support is 
> added), 
> we must weigh the kernel bloat (of 3 or 4 additional C files for the basic 
> kernel crypto API logic) against relieving the CPU of work.

There are a number of caveats with using hardware acceleration; one is
that many hardware accelerators are optimized for bulk data
encryption, and so key scheduling, or switching between key schedules,
can have a higher overhead that a pure software implementation.

There have also been situations where the hardware crypto engine is
actually slower than a highly optimized software implementation.  This
has been the case for certain ARM SOC's, for example.

This is not that big of deal, if you are developing a cryptographic
application (such as file system level encryption, for example) for a
specific hardware platform (such as a specific Nexus device).  But if
you are trying to develop a generic service that has to work on a wide
variety of CPU architectures, and specific CPU/SOC implementations,
life is a lot more complicated.  I've worked on both problems, let me
assure you the second is way tricker than the first.

> Then, the use of the DRBG offers users to choose between a Hash/HMAC and CTR 
> implementation to suit their needs. The DRBG code is agnostic of the 
> underlying cipher. So, you could even use Blowfish instead of AES or 
> whirlpool 
> instead of SHA -- these changes are just one entry in drbg_cores[] away 
> without any code change.

I really question how much this matters in practice.  Unless you are a
US Government Agency, where you might be laboring under a Federal
mandate to use DUAL-EC DRBG (for example), most users really don't
care about the details of the algorithm used in their random number
generator.  Giving users choice (or lots of knobs) isn't necessarily
always a feature, as the many TLS downgrade attacks have demonstrated.

This is why from my perspectve it's more important to implement an
interface which is always there, and which by default is secure, to
minimize the chances that random JV-team kernel developers working for
a Linux distribution or some consumer electronics manufacturer won't
actually make things worse.  As the Debian's attempt to "improve" the
security of OpenSSL demonstrates, it doesn't always end well.  :-)

If we implement something which happens to result in a 2 minute stall
in boot times, the danger is that a clueless engineer at Sony, or LGE,
or Motorola, or BMW, or Toyota, etc, will "fix" the problem without
telling anyone about what they did, and we might not notice right away
that the fix was in fact catastrophically bad.

These aren't the standard things which academics tend to worry about,
which tend to assume that attackers will be able to read arbitrary
kernel memory, and recovering such an exposure of the entropy pool is
_the_ most important thing to worry about (as opposed to say, the
contents of the user's private keys in the ssh-agent process).  But
this will perhaps explain why worrying about accomodating users who
care about whether Blowfish or AES should be used in their random
number generator isn't near the top of my personal priority list.

Cheers,

- Ted

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] KEYS: Add placeholder for KDF usage with DH

2016-05-26 Thread David Howells
Hi James,

Could you pass this along to Linus as soon as possible, please?  This
alters a new keyctl function added in the current merge window to allow for
a future extension planned for the next merge window.

Thanks,
David
---
From: Stephan Mueller 
Date: Thu May 26 23:38:12 2016 +0200

KEYS: Add placeholder for KDF usage with DH

The values computed during Diffie-Hellman key exchange are often used
in combination with key derivation functions to create cryptographic
keys.  Add a placeholder for a later implementation to configure a
key derivation function that will transform the Diffie-Hellman
result returned by the KEYCTL_DH_COMPUTE command.

[This patch was stripped down from a patch produced by Mat Martineau that
 had a bug in the compat code - so for the moment Stephan's patch simply
 requires that the placeholder argument must be NULL]

Original-signed-off-by: Mat Martineau 
Signed-off-by: Stephan Mueller 
Signed-off-by: David Howells 
---
 Documentation/security/keys.txt |5 -
 security/keys/compat.c  |2 +-
 security/keys/dh.c  |8 +++-
 security/keys/internal.h|5 +++--
 security/keys/keyctl.c  |4 ++--
 5 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/Documentation/security/keys.txt b/Documentation/security/keys.txt
index 20d05719bceb..3849814bfe6d 100644
--- a/Documentation/security/keys.txt
+++ b/Documentation/security/keys.txt
@@ -826,7 +826,8 @@ The keyctl syscall functions are:
  (*) Compute a Diffie-Hellman shared secret or public key
 
long keyctl(KEYCTL_DH_COMPUTE, struct keyctl_dh_params *params,
-  char *buffer, size_t buflen);
+  char *buffer, size_t buflen,
+  void *reserved);
 
  The params struct contains serial numbers for three keys:
 
@@ -843,6 +844,8 @@ The keyctl syscall functions are:
  public key.  If the base is the remote public key, the result is
  the shared secret.
 
+ The reserved argument must be set to NULL.
+
  The buffer length must be at least the length of the prime, or zero.
 
  If the buffer length is nonzero, the length of the result is
diff --git a/security/keys/compat.c b/security/keys/compat.c
index c8783b3b628c..36c80bf5b89c 100644
--- a/security/keys/compat.c
+++ b/security/keys/compat.c
@@ -134,7 +134,7 @@ COMPAT_SYSCALL_DEFINE5(keyctl, u32, option,
 
case KEYCTL_DH_COMPUTE:
return keyctl_dh_compute(compat_ptr(arg2), compat_ptr(arg3),
-arg4);
+arg4, compat_ptr(arg5));
 
default:
return -EOPNOTSUPP;
diff --git a/security/keys/dh.c b/security/keys/dh.c
index 880505a4b9f1..531ed2ec132f 100644
--- a/security/keys/dh.c
+++ b/security/keys/dh.c
@@ -78,7 +78,8 @@ error:
 }
 
 long keyctl_dh_compute(struct keyctl_dh_params __user *params,
-  char __user *buffer, size_t buflen)
+  char __user *buffer, size_t buflen,
+  void __user *reserved)
 {
long ret;
MPI base, private, prime, result;
@@ -97,6 +98,11 @@ long keyctl_dh_compute(struct keyctl_dh_params __user 
*params,
goto out;
}
 
+   if (reserved) {
+   ret = -EINVAL;
+   goto out;
+   }
+
keylen = mpi_from_key(pcopy.prime, buflen, );
if (keylen < 0 || !prime) {
/* buflen == 0 may be used to query the required buffer size,
diff --git a/security/keys/internal.h b/security/keys/internal.h
index 8ec7a528365d..a705a7d92ad7 100644
--- a/security/keys/internal.h
+++ b/security/keys/internal.h
@@ -260,10 +260,11 @@ static inline long keyctl_get_persistent(uid_t uid, 
key_serial_t destring)
 
 #ifdef CONFIG_KEY_DH_OPERATIONS
 extern long keyctl_dh_compute(struct keyctl_dh_params __user *, char __user *,
- size_t);
+ size_t, void __user *);
 #else
 static inline long keyctl_dh_compute(struct keyctl_dh_params __user *params,
-char __user *buffer, size_t buflen)
+char __user *buffer, size_t buflen,
+void __user *reserved)
 {
return -EOPNOTSUPP;
 }
diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index 3b135a0af344..d580ad06b792 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -1688,8 +1688,8 @@ SYSCALL_DEFINE5(keyctl, int, option, unsigned long, arg2, 
unsigned long, arg3,
 
case KEYCTL_DH_COMPUTE:
return keyctl_dh_compute((struct keyctl_dh_params __user *) 
arg2,
-(char __user *) arg3,
-(size_t) arg4);
+(char __user *) arg3, (size_t) arg4,
+  

[PATCH v2] KEYS: Add placeholder KDF usage with DH

2016-05-26 Thread Stephan Mueller
Am Donnerstag, 26. Mai 2016, 21:10:50 schrieb David Howells:

Hi David,

Mat asked me to step in in case his patch needed an update.

---8<---
The values computed during Diffie-Hellman key exchange are often used
in combination with key derivation functions to create cryptographic
keys. Add a placeholder for a later implementation to configure a
key derivation function that will transform the Diffie-Hellman
result returned by the KEYCTL_DH_COMPUTE command.

Signed-off-by: Mat Martineau 
Signed-off-by: Stephan Mueller 
---
 Documentation/security/keys.txt | 5 -
 security/keys/compat.c  | 2 +-
 security/keys/dh.c  | 8 +++-
 security/keys/internal.h| 5 +++--
 security/keys/keyctl.c  | 4 ++--
 5 files changed, 17 insertions(+), 7 deletions(-)

diff --git a/Documentation/security/keys.txt b/Documentation/security/keys.txt
index 20d0571..3849814 100644
--- a/Documentation/security/keys.txt
+++ b/Documentation/security/keys.txt
@@ -826,7 +826,8 @@ The keyctl syscall functions are:
  (*) Compute a Diffie-Hellman shared secret or public key
 
long keyctl(KEYCTL_DH_COMPUTE, struct keyctl_dh_params *params,
-  char *buffer, size_t buflen);
+  char *buffer, size_t buflen,
+  void *reserved);
 
  The params struct contains serial numbers for three keys:
 
@@ -843,6 +844,8 @@ The keyctl syscall functions are:
  public key.  If the base is the remote public key, the result is
  the shared secret.
 
+ The reserved argument must be set to NULL.
+
  The buffer length must be at least the length of the prime, or zero.
 
  If the buffer length is nonzero, the length of the result is
diff --git a/security/keys/compat.c b/security/keys/compat.c
index c8783b3..36c80bf 100644
--- a/security/keys/compat.c
+++ b/security/keys/compat.c
@@ -134,7 +134,7 @@ COMPAT_SYSCALL_DEFINE5(keyctl, u32, option,
 
case KEYCTL_DH_COMPUTE:
return keyctl_dh_compute(compat_ptr(arg2), compat_ptr(arg3),
-arg4);
+arg4, compat_ptr(arg5));
 
default:
return -EOPNOTSUPP;
diff --git a/security/keys/dh.c b/security/keys/dh.c
index 880505a..531ed2e 100644
--- a/security/keys/dh.c
+++ b/security/keys/dh.c
@@ -78,7 +78,8 @@ error:
 }
 
 long keyctl_dh_compute(struct keyctl_dh_params __user *params,
-  char __user *buffer, size_t buflen)
+  char __user *buffer, size_t buflen,
+  void __user *reserved)
 {
long ret;
MPI base, private, prime, result;
@@ -97,6 +98,11 @@ long keyctl_dh_compute(struct keyctl_dh_params __user 
*params,
goto out;
}
 
+   if (reserved) {
+   ret = -EINVAL;
+   goto out;
+   }
+
keylen = mpi_from_key(pcopy.prime, buflen, );
if (keylen < 0 || !prime) {
/* buflen == 0 may be used to query the required buffer size,
diff --git a/security/keys/internal.h b/security/keys/internal.h
index 8ec7a52..a705a7d 100644
--- a/security/keys/internal.h
+++ b/security/keys/internal.h
@@ -260,10 +260,11 @@ static inline long keyctl_get_persistent(uid_t uid, 
key_serial_t destring)
 
 #ifdef CONFIG_KEY_DH_OPERATIONS
 extern long keyctl_dh_compute(struct keyctl_dh_params __user *, char __user *,
- size_t);
+ size_t, void __user *);
 #else
 static inline long keyctl_dh_compute(struct keyctl_dh_params __user *params,
-char __user *buffer, size_t buflen)
+char __user *buffer, size_t buflen,
+void __user *reserved)
 {
return -EOPNOTSUPP;
 }
diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index 3b135a0..d580ad0 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -1688,8 +1688,8 @@ SYSCALL_DEFINE5(keyctl, int, option, unsigned long, arg2, 
unsigned long, arg3,
 
case KEYCTL_DH_COMPUTE:
return keyctl_dh_compute((struct keyctl_dh_params __user *) 
arg2,
-(char __user *) arg3,
-(size_t) arg4);
+(char __user *) arg3, (size_t) arg4,
+(void __user *) arg5);
 
default:
return -EOPNOTSUPP;
-- 
2.5.5


--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: [PATCH 2/7] crypto : async implementation for sha1-mb

2016-05-26 Thread Yu, Fenghua
> From: Dey, Megha
> Sent: Thursday, May 19, 2016 5:43 PM
> To: herb...@gondor.apana.org.au
> Cc: tim.c.c...@linux.intel.com; da...@davemloft.net; linux-
> cry...@vger.kernel.org; linux-ker...@vger.kernel.org; Dey, Megha
> ; Yu, Fenghua ; Megha
> Dey 
> Subject: [PATCH 2/7] crypto : async implementation for sha1-mb
> 
> From: Megha Dey 
> 
> Herbert wants the sha1-mb algorithm to have an async implementation:
> https://lkml.org/lkml/2016/4/5/286.
> Currently, sha1-mb uses an async interface for the outer algorithm and a sync
> interface for the inner algorithm. This patch introduces a async interface for
> even the inner algorithm.
> 
> Signed-off-by: Megha Dey 
> Signed-off-by: Tim Chen 
> ---
>  arch/x86/crypto/sha-mb/sha1_mb.c | 190 ++--
> ---
>  crypto/ahash.c   |   6 --
>  crypto/mcryptd.c | 117 +---
>  include/crypto/hash.h|   6 ++
>  include/crypto/internal/hash.h   |   8 +-
>  include/crypto/mcryptd.h |   8 +-
>  6 files changed, 184 insertions(+), 151 deletions(-)
> 

Hi, Herbert,

Any comment/feedback on this patch set? Is there any plan to push it to 
upstream?

Thanks.

-Fenghua
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/5] lib/digsig: digsig_verify_rsa(): return -EINVAL if modulo length is zero

2016-05-26 Thread Nicolai Stange
Currently, if digsig_verify_rsa() detects that the modulo's length is zero,
i.e. mlen == 0, it returns -ENOMEM which doesn't really fit here.

Make digsig_verify_rsa() return -EINVAL upon mlen == 0.

Signed-off-by: Nicolai Stange 
---
 lib/digsig.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/lib/digsig.c b/lib/digsig.c
index a121cbc..55b8b2f 100644
--- a/lib/digsig.c
+++ b/lib/digsig.c
@@ -114,13 +114,15 @@ static int digsig_verify_rsa(struct key *key,
datap += remaining;
}
 
-   err = -ENOMEM;
-
mblen = mpi_get_nbits(pkey[0]);
mlen = DIV_ROUND_UP(mblen, 8);
 
-   if (mlen == 0)
+   if (mlen == 0) {
+   err = -EINVAL;
goto err;
+   }
+
+   err = -ENOMEM;
 
out1 = kzalloc(mlen, GFP_KERNEL);
if (!out1)
-- 
2.8.2

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/5] lib/mpi: mpi_read_from_buffer(): return error code

2016-05-26 Thread Nicolai Stange
mpi_read_from_buffer() reads a MPI from a buffer into a newly allocated
MPI instance. It expects the buffer's leading two bytes to contain the
number of bits, followed by the actual payload.

On failure, it returns NULL and updates the in/out argument ret_nread
somewhat inconsistently:
- If the given buffer is too short to contain the leading two bytes
  encoding the number of bits or their value is unsupported, then
  ret_nread will be cleared.
- If the allocation of the resulting MPI instance fails, ret_nread is left
  as is.

The only user of mpi_read_from_buffer(), digsig_verify_rsa(), simply checks
for a return value of NULL and returns -ENOMEM if that happens.

While this is all of cosmetic nature only, there is another error condition
which currently isn't detectable by the caller of mpi_read_from_buffer():
if the given buffer is too small to hold the number of bits as encoded in
its first two bytes, the return value will be non-NULL and *ret_nread > 0.

In preparation of communicating this condition to the caller, let
mpi_read_from_buffer() return error values by means of the ERR_PTR()
mechanism.

Make the sole caller of mpi_read_from_buffer(), digsig_verify_rsa(),
check the return value for IS_ERR() rather than == NULL. If IS_ERR() is
true, return the associated error value rather than the fixed -ENOMEM.

Signed-off-by: Nicolai Stange 
---
 lib/digsig.c   | 12 
 lib/mpi/mpicoder.c |  6 +++---
 2 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/lib/digsig.c b/lib/digsig.c
index 07be6c1..a121cbc 100644
--- a/lib/digsig.c
+++ b/lib/digsig.c
@@ -104,16 +104,18 @@ static int digsig_verify_rsa(struct key *key,
datap = pkh->mpi;
endp = ukp->data + ukp->datalen;
 
-   err = -ENOMEM;
-
for (i = 0; i < pkh->nmpi; i++) {
unsigned int remaining = endp - datap;
pkey[i] = mpi_read_from_buffer(datap, );
-   if (!pkey[i])
+   if (IS_ERR(pkey[i])) {
+   err = PTR_ERR(pkey[i]);
goto err;
+   }
datap += remaining;
}
 
+   err = -ENOMEM;
+
mblen = mpi_get_nbits(pkey[0]);
mlen = DIV_ROUND_UP(mblen, 8);
 
@@ -126,8 +128,10 @@ static int digsig_verify_rsa(struct key *key,
 
nret = siglen;
in = mpi_read_from_buffer(sig, );
-   if (!in)
+   if (IS_ERR(in)) {
+   err = PTR_ERR(in);
goto err;
+   }
 
res = mpi_alloc(mpi_get_nlimbs(in) * 2);
if (!res)
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 747606f..275c71e 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -88,12 +88,12 @@ MPI mpi_read_from_buffer(const void *xbuffer, unsigned 
*ret_nread)
MPI val = NULL;
 
if (*ret_nread < 2)
-   goto leave;
+   return ERR_PTR(-EINVAL);
nbits = buffer[0] << 8 | buffer[1];
 
if (nbits > MAX_EXTERN_MPI_BITS) {
pr_info("MPI: mpi too large (%u bits)\n", nbits);
-   goto leave;
+   return ERR_PTR(-EINVAL);
}
buffer += 2;
nread = 2;
@@ -102,7 +102,7 @@ MPI mpi_read_from_buffer(const void *xbuffer, unsigned 
*ret_nread)
nlimbs = DIV_ROUND_UP(nbytes, BYTES_PER_MPI_LIMB);
val = mpi_alloc(nlimbs);
if (!val)
-   return NULL;
+   return ERR_PTR(-ENOMEM);
i = BYTES_PER_MPI_LIMB - nbytes % BYTES_PER_MPI_LIMB;
i %= BYTES_PER_MPI_LIMB;
val->nbits = nbits;
-- 
2.8.2

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 5/5] lib/mpi: refactor mpi_read_from_buffer() in terms of mpi_read_raw_data()

2016-05-26 Thread Nicolai Stange
mpi_read_from_buffer() and mpi_read_raw_data() do basically the same thing
except that the former extracts the number of payload bits from the first
two bytes of the input buffer.

Besides that, the data copying logic is exactly the same.

Replace the open coded buffer to MPI instance conversion by a call to
mpi_read_raw_data().

Signed-off-by: Nicolai Stange 
---
 lib/mpi/mpicoder.c | 24 +++-
 1 file changed, 3 insertions(+), 21 deletions(-)

diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 2f4d039..e8a5742 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -82,10 +82,8 @@ EXPORT_SYMBOL_GPL(mpi_read_raw_data);
 MPI mpi_read_from_buffer(const void *xbuffer, unsigned *ret_nread)
 {
const uint8_t *buffer = xbuffer;
-   int i, j;
-   unsigned nbits, nbytes, nlimbs;
-   mpi_limb_t a;
-   MPI val = NULL;
+   unsigned int nbits, nbytes;
+   MPI val;
 
if (*ret_nread < 2)
return ERR_PTR(-EINVAL);
@@ -95,7 +93,6 @@ MPI mpi_read_from_buffer(const void *xbuffer, unsigned 
*ret_nread)
pr_info("MPI: mpi too large (%u bits)\n", nbits);
return ERR_PTR(-EINVAL);
}
-   buffer += 2;
 
nbytes = DIV_ROUND_UP(nbits, 8);
if (nbytes + 2 > *ret_nread) {
@@ -104,24 +101,9 @@ MPI mpi_read_from_buffer(const void *xbuffer, unsigned 
*ret_nread)
return ERR_PTR(-EINVAL);
}
 
-   nlimbs = DIV_ROUND_UP(nbytes, BYTES_PER_MPI_LIMB);
-   val = mpi_alloc(nlimbs);
+   val = mpi_read_raw_data(buffer + 2, nbytes);
if (!val)
return ERR_PTR(-ENOMEM);
-   i = BYTES_PER_MPI_LIMB - nbytes % BYTES_PER_MPI_LIMB;
-   i %= BYTES_PER_MPI_LIMB;
-   val->nbits = nbits;
-   j = val->nlimbs = nlimbs;
-   val->sign = 0;
-   for (; j > 0; j--) {
-   a = 0;
-   for (; i < BYTES_PER_MPI_LIMB; i++) {
-   a <<= 8;
-   a |= *buffer++;
-   }
-   i = 0;
-   val->d[j - 1] = a;
-   }
 
*ret_nread = nbytes + 2;
return val;
-- 
2.8.2

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 3/5] lib/mpi: mpi_read_from_buffer(): return -EINVAL upon too short buffer

2016-05-26 Thread Nicolai Stange
Currently, if the input buffer is shorter than the expected length as
indicated by its first two bytes, an MPI instance of this expected length
will be allocated and filled with as much data as is available. The rest
will remain uninitialized.

Instead of leaving this condition undetected, an error code should be
reported to the caller.

Since this situation indicates that the input buffer's first two bytes,
encoding the number of expected bits, are garbled, -EINVAL is appropriate
here.

If the input buffer is shorter than indicated by its first two bytes,
make mpi_read_from_buffer() return -EINVAL.
Get rid of the 'nread' variable: with the new semantics, the total number
of bytes read from the input buffer is known in advance.

Signed-off-by: Nicolai Stange 
---
 lib/mpi/mpicoder.c | 18 --
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 275c71e..869c66c 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -83,7 +83,7 @@ MPI mpi_read_from_buffer(const void *xbuffer, unsigned 
*ret_nread)
 {
const uint8_t *buffer = xbuffer;
int i, j;
-   unsigned nbits, nbytes, nlimbs, nread = 0;
+   unsigned nbits, nbytes, nlimbs;
mpi_limb_t a;
MPI val = NULL;
 
@@ -96,9 +96,14 @@ MPI mpi_read_from_buffer(const void *xbuffer, unsigned 
*ret_nread)
return ERR_PTR(-EINVAL);
}
buffer += 2;
-   nread = 2;
 
nbytes = DIV_ROUND_UP(nbits, 8);
+   if (nbytes + 2 > *ret_nread) {
+   printk("MPI: mpi larger than buffer nread=%d ret_nread=%d\n",
+   *ret_nread + 1, *ret_nread);
+   return ERR_PTR(-EINVAL);
+   }
+
nlimbs = DIV_ROUND_UP(nbytes, BYTES_PER_MPI_LIMB);
val = mpi_alloc(nlimbs);
if (!val)
@@ -111,12 +116,6 @@ MPI mpi_read_from_buffer(const void *xbuffer, unsigned 
*ret_nread)
for (; j > 0; j--) {
a = 0;
for (; i < BYTES_PER_MPI_LIMB; i++) {
-   if (++nread > *ret_nread) {
-   printk
-   ("MPI: mpi larger than buffer nread=%d 
ret_nread=%d\n",
-nread, *ret_nread);
-   goto leave;
-   }
a <<= 8;
a |= *buffer++;
}
@@ -124,8 +123,7 @@ MPI mpi_read_from_buffer(const void *xbuffer, unsigned 
*ret_nread)
val->d[j - 1] = a;
}
 
-leave:
-   *ret_nread = nread;
+   *ret_nread = nbytes + 2;
return val;
 }
 EXPORT_SYMBOL_GPL(mpi_read_from_buffer);
-- 
2.8.2

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 4/5] lib/mpi: mpi_read_from_buffer(): sanitize short buffer printk

2016-05-26 Thread Nicolai Stange
The first two bytes of the input buffer encode its expected length and
mpi_read_from_buffer() prints a console message if the given buffer is too
short.

However, there are some oddities with how this message is printed:
- It is printed at the default loglevel. This is different from the
  one used in the case that the first two bytes' value is unsupportedly
  large, i.e. KERN_INFO.
- The format specifier '%d' is used for unsigned ints.
- It prints the values of nread and *ret_nread. This is redundant since
  the former is always the latter + 1.

Clean this up as follows:
- Use pr_info() rather than printk() with no loglevel.
- Use the format specifiers '%u' in place if '%d'.
- Do not print the redundant 'nread' but the more helpful 'nbytes' value.

Signed-off-by: Nicolai Stange 
---
 lib/mpi/mpicoder.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 869c66c..2f4d039 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -99,8 +99,8 @@ MPI mpi_read_from_buffer(const void *xbuffer, unsigned 
*ret_nread)
 
nbytes = DIV_ROUND_UP(nbits, 8);
if (nbytes + 2 > *ret_nread) {
-   printk("MPI: mpi larger than buffer nread=%d ret_nread=%d\n",
-   *ret_nread + 1, *ret_nread);
+   pr_info("MPI: mpi larger than buffer nbytes=%u ret_nread=%u\n",
+   nbytes, *ret_nread);
return ERR_PTR(-EINVAL);
}
 
-- 
2.8.2

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 0/5] refactor mpi_read_from_buffer()

2016-05-26 Thread Nicolai Stange
mpi_read_from_buffer() and mpi_read_raw_data() do almost the same and share a
fair amount of common code.

This patchset attempts to rewrite mpi_read_from_buffer() in order to implement
it in terms of mpi_read_raw_data().

The patches 1 and 3, i.e.
  "lib/mpi: mpi_read_from_buffer(): return error code"
and
  "lib/mpi: mpi_read_from_buffer(): return -EINVAL upon too short buffer"
do the groundwork in that they move any error detection unique to
mpi_read_from_buffer() out of the data handling loop.

The patches 2 and 4, that is
  "lib/digsig: digsig_verify_rsa(): return -EINVAL if modulo length is zero"
and
  "lib/mpi: mpi_read_from_buffer(): sanitize short buffer printk"
are not strictly necessary for the refactoring: they cleanup some minor oddities
related to error handling I came across.

Finally, the last patch in this series,
  "lib/mpi: refactor mpi_read_from_buffer() in terms of mpi_read_raw_data()"
actually does what this series is all about.


Applicable to linux-next-20160325.

A note on testing: allmodconfig on x86_64 builds fine. 
However, the only caller of mpi_read_from_buffer() is digsig_verify_rsa()
and this one is solely used by the IMA/EVM infrastructure.
In my current setup, I don't have any IMA/EVM stuff in place and thus,
I can't do any runtime tests without putting *much* effort into it.
I would really appreciate if someone with a working IMA/EVM setup could do some
brief testing...

Nicolai Stange (5):
  lib/mpi: mpi_read_from_buffer(): return error code
  lib/digsig: digsig_verify_rsa(): return -EINVAL if modulo length is
zero
  lib/mpi: mpi_read_from_buffer(): return -EINVAL upon too short buffer
  lib/mpi: mpi_read_from_buffer(): sanitize short buffer printk
  lib/mpi: refactor mpi_read_from_buffer() in terms of
mpi_read_raw_data()

 lib/digsig.c   | 16 +++-
 lib/mpi/mpicoder.c | 46 +-
 2 files changed, 24 insertions(+), 38 deletions(-)

-- 
2.8.2

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] KEYS: Add optional key derivation parameters for DH

2016-05-26 Thread David Howells
Mat Martineau  wrote:

> +struct keyctl_kdf_params {
> + char *name;
> + __u8 reserved[32]; /* Reserved for future use, must be 0 */
> +};
> +
>  #endif /*  _LINUX_KEYCTL_H */
> diff --git a/security/keys/compat.c b/security/keys/compat.c
> index c8783b3..36c80bf 100644
> --- a/security/keys/compat.c
> +++ b/security/keys/compat.c
> @@ -134,7 +134,7 @@ COMPAT_SYSCALL_DEFINE5(keyctl, u32, option,
>  
>   case KEYCTL_DH_COMPUTE:
>   return keyctl_dh_compute(compat_ptr(arg2), compat_ptr(arg3),
> -  arg4);
> +  arg4, compat_ptr(arg5));

Given the new structure above, this won't work.  The problem is that on a
64-bit system the kernel expects 'name' to be a 64-bit pointer, but if we're
in the compat handler, we have a 32-bit userspace's idea of the struct - in
which 'name' is a 31-bit (s390x) or a 32-bit pointer without any padding.

So in compat code you can't just pass the user pointer direct through to
keyctl_dh_compute().  You need to supply a compat_keyctl_kdf_params struct and
translator code.

What I would recommend you do at the moment is to mark the syscall argument as
"reserved, must be 0" and deal with the implementation in the next merge
window.

David

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: AES-NI: slower than aes-generic?

2016-05-26 Thread Sandy Harris
On Thu, May 26, 2016 at 2:49 PM, Stephan Mueller  wrote:

> Then, the use of the DRBG offers users to choose between a Hash/HMAC and CTR
> implementation to suit their needs. The DRBG code is agnostic of the
> underlying cipher. So, you could even use Blowfish instead of AES or whirlpool
> instead of SHA -- these changes are just one entry in drbg_cores[] away
> without any code change.

Not Blowfish in anything like the code you describe! It has only
64-bit blocks which might or might not be a problem, but it also has
an extremely expensive key schedule which would be awful if you want
to rekey often.

I'd say if you want a block cipher there you can quite safely restrict
the interface to ciphers with the same block & key sizes as AES.
Implement AES and one of the other finalists (I'd pick Serpent) to
test, and others can add the remaining finalists or national standards
like Korean ARIA or the Japanese one if they want them.
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: AES-NI: slower than aes-generic?

2016-05-26 Thread Stephan Mueller
Am Donnerstag, 26. Mai 2016, 14:20:19 schrieb Sandy Harris:

Hi Sandy,
> 
> Why are you using AES? Granted, it is a reasonable idea, but when Ted
> replaced the non-blocking pool with a DBRG, he used a different cipher
> (I think chacha, not certain) and I think chose not to use the crypto
> library implementation to avoid kernel bloat.
> 
> So he has adopted on of your better ideas. Why not follow his
> lead on how to implement it?

Using the kernel crypto API one can relieve the CPU of the crypto work, if a 
hardware or assembler implementation is available. This may be of particular 
interest for smaller systems. So, for smaller systems (where kernel bloat is 
bad, but where now these days more and more hardware crypto support is added), 
we must weigh the kernel bloat (of 3 or 4 additional C files for the basic 
kernel crypto API logic) against relieving the CPU of work.

Then, the use of the DRBG offers users to choose between a Hash/HMAC and CTR 
implementation to suit their needs. The DRBG code is agnostic of the 
underlying cipher. So, you could even use Blowfish instead of AES or whirlpool 
instead of SHA -- these changes are just one entry in drbg_cores[] away 
without any code change.

Finally, the LRNG code is completely agnostic of the underlying deterministic 
RNG. You only need a replacement of two small functions to invoke the seeding 
and generate operation of a DRNG. So, if one wants a Chacha20, he can have it. 
If one wants X9.31, he can have it. See section 2.8.3 [1] -- note, that DRNG 
does not even need to be a kernel crypto API registered implementation.

Bottom line, I want to give folks a full flexibility. That said, the LRNG code 
is more of a logic to collect entropy and maintain two DRNG types which are 
seeded according to a defined schedule than it is a tightly integrated RNG.

Also, I am not so sure that simply taking a cipher, sprinkling some 
backtracking logic on it implies you have a good DRNG. As of now, I have not 
seen assessments from others for the Chacha20 DRNG approach. I personally 
would think that the Chacha20 approach from Ted is good. Yet others may have a 
more conservative approach of using a DRNG implementation that has been 
reviewed by a lot of folks.

[1] http://www.chronox.de/lrng/doc/lrng.pdf

Ciao
Stephan
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: AES-NI: slower than aes-generic?

2016-05-26 Thread Sandy Harris
Stephan Mueller  wrote:

> for the DRBG and the LRNG work I am doing, I also test the speed of the DRBG.
> The DRBG can be considered as a form of block chaining mode on top of a raw
> cipher.
>
> What I am wondering is that when encrypting 256 16 byte blocks, I get a speed
> of about 170 MB/s with the AES-NI driver. When using the aes-generic or aes-
> asm, I get up to 180 MB/s with all else being equal. Note, that figure
> includes a copy_to_user of the generated data.

Why are you using AES? Granted, it is a reasonable idea, but when Ted
replaced the non-blocking pool with a DBRG, he used a different cipher
(I think chacha, not certain) and I think chose not to use the crypto
library implementation to avoid kernel bloat.

So he has adopted on of your better ideas. Why not follow his
lead on how to implement it?
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: AES-NI: slower than aes-generic?

2016-05-26 Thread Stephan Mueller
Am Donnerstag, 26. Mai 2016, 19:30:01 schrieb Stephan Mueller:

Hi,

> 
> However, the key difference to a standard speed test is that I set up a new
> key schedule quite frequently. And I would suspect that something is going
> on here...

With tcrypt, there is some interesting hint: on smaller blocks, the C 
implementation is indeed faster:

[   20.391510] 
   testing speed of async ecb(aes) (ecb(aes-generic)) encryption
   20.391513] test 0 (128 bit key, 16 byte blocks): 1 operation in 275 cycles 
(16 bytes)
[   20.391517] test 1 (128 bit key, 64 byte blocks): 1 operation in 702 cycles 
(64 bytes)
[   20.391521] test 2 (128 bit key, 256 byte blocks): 1 operation in 2431 
cycles (256 bytes)
[   20.391532] test 3 (128 bit key, 1024 byte blocks): 1 operation in 9347 
cycles (1024 bytes)
[   20.391570] test 4 (128 bit key, 8192 byte blocks): 1 operation in 74375 
cycles (8192 bytes)


vs for ecb-aes-aesni:

[  143.482123] test 0 (128 bit key, 16 byte blocks): 1 operation in 1203 
cycles (16 bytes)
[  143.482138] test 1 (128 bit key, 64 byte blocks): 1 operation in 1328 
cycles (64 bytes)
[  143.482148] test 2 (128 bit key, 256 byte blocks): 1 operation in 1922 
cycles (256 bytes)
[  143.482159] test 3 (128 bit key, 1024 byte blocks): 1 operation in 3328 
cycles (1024 bytes)
[  143.482176] test 4 (128 bit key, 8192 byte blocks): 1 operation in 19483 
cycles (8192 bytes)


As I use crypto_cipher_encrypt_one, I only send one block at a time to AES-NI.


Ciao
Stephan
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: AES-NI: slower than aes-generic?

2016-05-26 Thread Jeffrey Walton
> What I am wondering is that when encrypting 256 16 byte blocks, I get a speed
> of about 170 MB/s with the AES-NI driver. When using the aes-generic or aes-
> asm, I get up to 180 MB/s with all else being equal. Note, that figure
> includes a copy_to_user of the generated data.
>
> ...

Something sounds amiss.

AES-NI should be on the order of magnitude faster than a generic
implementation. Can you verify AES-NI is actually using AES-NI, and
aes-generic is a software implementation?

Here are some OpenSSL numbers. EVP uses AES-NI when available.
Omitting -evp means its software only (no hardware acceleration, like
AES-NI).

$ openssl speed -elapsed -evp aes-128-cbc
You have chosen to measure elapsed time instead of user CPU time.
...
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes256 bytes   1024 bytes   8192 bytes
aes-128-cbc 626533.60k   669884.42k   680917.93k   682079.91k   684736.51k


$ openssl speed -elapsed aes-128-cbc
You have chosen to measure elapsed time instead of user CPU time.
...
The 'numbers' are in 1000s of bytes per second processed.
type 16 bytes 64 bytes256 bytes   1024 bytes   8192 bytes
aes-128 cbc 106520.59k   114380.16k   116741.46k   117489.32k   117563.39k

Jeff
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: AES-NI: slower than aes-generic?

2016-05-26 Thread Stephan Mueller
Am Donnerstag, 26. Mai 2016, 13:25:02 schrieb Jeffrey Walton:

Hi Jeffrey,

> > What I am wondering is that when encrypting 256 16 byte blocks, I get a
> > speed of about 170 MB/s with the AES-NI driver. When using the
> > aes-generic or aes- asm, I get up to 180 MB/s with all else being equal.
> > Note, that figure includes a copy_to_user of the generated data.
> > 
> > ...
> 
> Something sounds amiss.
> 
> AES-NI should be on the order of magnitude faster than a generic
> implementation. Can you verify AES-NI is actually using AES-NI, and
> aes-generic is a software implementation?

I am pretty sure I am using the right implementations as I checked the 
refcount in /proc/crypto.
> 
> Here are some OpenSSL numbers. EVP uses AES-NI when available.
> Omitting -evp means its software only (no hardware acceleration, like
> AES-NI).

I understand that AES-NI should be faster. That is what I am wondering about.

However, the key difference to a standard speed test is that I set up a new 
key schedule quite frequently. And I would suspect that something is going on 
here...

Ciao
Stephan
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: key retention service: DH support

2016-05-26 Thread Mat Martineau


On Wed, 25 May 2016, David Howells wrote:


Mat Martineau  wrote:


Since the KDF patches are not yet merged, I'm not sure of the best way to
accomodate the future feature. We could future-proof KEYCTL_DH_COMPUTE by
adding a 5th arg, an optional pointer to KDF configuration (NAME and
LABEL).


If we want to do this, it needs to be done before the merge window closes,
maybe by -rc2.  Just requiring the extra argument to be 0 for now and/or
extending struct keyctl_dh_params to include some must-be-zeroed spare space
would do for now.


I sent a patch ([PATCH] KEYS: Add optional key derivation parameters for 
DH) to define the additional parameter. For now, an error is returned if 
KDF configuration is provided.


--
Mat Martineau
Intel OTC
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] KEYS: Add optional key derivation parameters for DH

2016-05-26 Thread Mat Martineau
The values computed during Diffie-Hellman key exchange are often used
in combination with key derivation functions to create cryptographic
keys. Add an interface to configure a key derivation function that
will transform the Diffie-Hellman result returned by the
KEYCTL_DH_COMPUTE command.

Signed-off-by: Mat Martineau 
---
 Documentation/security/keys.txt | 9 -
 include/uapi/linux/keyctl.h | 5 +
 security/keys/compat.c  | 2 +-
 security/keys/dh.c  | 8 +++-
 security/keys/internal.h| 5 +++--
 security/keys/keyctl.c  | 4 ++--
 6 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/Documentation/security/keys.txt b/Documentation/security/keys.txt
index 20d0571..47470a6 100644
--- a/Documentation/security/keys.txt
+++ b/Documentation/security/keys.txt
@@ -826,7 +826,8 @@ The keyctl syscall functions are:
  (*) Compute a Diffie-Hellman shared secret or public key
 
long keyctl(KEYCTL_DH_COMPUTE, struct keyctl_dh_params *params,
-  char *buffer, size_t buflen);
+  char *buffer, size_t buflen,
+  struct keyctl_kdf_params *kdf);
 
  The params struct contains serial numbers for three keys:
 
@@ -843,6 +844,12 @@ The keyctl syscall functions are:
  public key.  If the base is the remote public key, the result is
  the shared secret.
 
+ When kdf is NULL, the raw computed value is written to the
+ buffer. If valid kdf parameters are provided, the result of the
+ DH computation is transformed according to the requested key
+ derivation function and the transformed value is written to the
+ buffer.
+
  The buffer length must be at least the length of the prime, or zero.
 
  If the buffer length is nonzero, the length of the result is
diff --git a/include/uapi/linux/keyctl.h b/include/uapi/linux/keyctl.h
index 86eddd6..d55cf53 100644
--- a/include/uapi/linux/keyctl.h
+++ b/include/uapi/linux/keyctl.h
@@ -68,4 +68,9 @@ struct keyctl_dh_params {
__s32 base;
 };
 
+struct keyctl_kdf_params {
+   char *name;
+   __u8 reserved[32]; /* Reserved for future use, must be 0 */
+};
+
 #endif /*  _LINUX_KEYCTL_H */
diff --git a/security/keys/compat.c b/security/keys/compat.c
index c8783b3..36c80bf 100644
--- a/security/keys/compat.c
+++ b/security/keys/compat.c
@@ -134,7 +134,7 @@ COMPAT_SYSCALL_DEFINE5(keyctl, u32, option,
 
case KEYCTL_DH_COMPUTE:
return keyctl_dh_compute(compat_ptr(arg2), compat_ptr(arg3),
-arg4);
+arg4, compat_ptr(arg5));
 
default:
return -EOPNOTSUPP;
diff --git a/security/keys/dh.c b/security/keys/dh.c
index 880505a..e415d4c 100644
--- a/security/keys/dh.c
+++ b/security/keys/dh.c
@@ -78,7 +78,8 @@ error:
 }
 
 long keyctl_dh_compute(struct keyctl_dh_params __user *params,
-  char __user *buffer, size_t buflen)
+  char __user *buffer, size_t buflen,
+  struct keyctl_kdf_params __user *kdf)
 {
long ret;
MPI base, private, prime, result;
@@ -97,6 +98,11 @@ long keyctl_dh_compute(struct keyctl_dh_params __user 
*params,
goto out;
}
 
+   if (kdf) {
+   ret = -EINVAL;
+   goto out;
+   }
+
keylen = mpi_from_key(pcopy.prime, buflen, );
if (keylen < 0 || !prime) {
/* buflen == 0 may be used to query the required buffer size,
diff --git a/security/keys/internal.h b/security/keys/internal.h
index 8ec7a52..1b722 100644
--- a/security/keys/internal.h
+++ b/security/keys/internal.h
@@ -260,10 +260,11 @@ static inline long keyctl_get_persistent(uid_t uid, 
key_serial_t destring)
 
 #ifdef CONFIG_KEY_DH_OPERATIONS
 extern long keyctl_dh_compute(struct keyctl_dh_params __user *, char __user *,
- size_t);
+ size_t, struct keyctl_kdf_params __user *);
 #else
 static inline long keyctl_dh_compute(struct keyctl_dh_params __user *params,
-char __user *buffer, size_t buflen)
+char __user *buffer, size_t buflen,
+struct keyctl_kdf_params __user *kdf)
 {
return -EOPNOTSUPP;
 }
diff --git a/security/keys/keyctl.c b/security/keys/keyctl.c
index 3b135a0..b106898 100644
--- a/security/keys/keyctl.c
+++ b/security/keys/keyctl.c
@@ -1688,8 +1688,8 @@ SYSCALL_DEFINE5(keyctl, int, option, unsigned long, arg2, 
unsigned long, arg3,
 
case KEYCTL_DH_COMPUTE:
return keyctl_dh_compute((struct keyctl_dh_params __user *) 
arg2,
-(char __user *) arg3,
-(size_t) arg4);
+(char __user *) arg3, (size_t) arg4,
+ 

AES-NI: slower than aes-generic?

2016-05-26 Thread Stephan Mueller
Hi,

for the DRBG and the LRNG work I am doing, I also test the speed of the DRBG. 
The DRBG can be considered as a form of block chaining mode on top of a raw 
cipher.

What I am wondering is that when encrypting 256 16 byte blocks, I get a speed 
of about 170 MB/s with the AES-NI driver. When using the aes-generic or aes-
asm, I get up to 180 MB/s with all else being equal. Note, that figure 
includes a copy_to_user of the generated data.

To be precise, the code does the following steps:

1. setkey

2. AES encrypt 256 blocks and copy_to_user

3. Use AES to generate a new key and start at step 1.

I am wondering why the AES-NI driver is slower than the C/ASM implementations 
given that all else is equal.

Note, if I have less blocks in step 2 above, AES-NI is becoming faster. E.g. 
if I have 8 blocks or just one block in step 2 above, AES-NI is faster by 10 
or 20%.

Ciao
Stephan
-- 
atsec information security GmbH, Steinstraße 70, 81667 München, Germany
P: +49 89 442 49 830 - F: +49 89 442 49 831
M: +49 172 216 55 78 - HRB: 129439 (Amtsgericht München)
US: +1 949 545 4096
GF: Salvatore la Pietra, Staffan Persson
atsec it security news blog - atsec-information-security.blogspot.com

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC 3/3] md: dm-crypt: Introduce the bulk mode method when sending request

2016-05-26 Thread Mike Snitzer
Comments inlined.

In general the most concerning bit is the need for memory allocation in
the IO path (see comment/question below near call to sg_alloc_table).
In DM targets we make heavy use of .ctr preallocated memory and/or
per-bio-data to avoid memory allocations in the IO path.

On Wed, May 25 2016 at  2:12am -0400,
Baolin Wang  wrote:

> In now dm-crypt code, it is ineffective to map one segment (always one
> sector) of one bio with just only one scatterlist at one time for hardware
> crypto engine. Especially for some encryption mode (like ecb or xts mode)
> cooperating with the crypto engine, they just need one initial IV or null
> IV instead of different IV for each sector. In this situation We can consider
> to use multiple scatterlists to map the whole bio and send all scatterlists
> of one bio to crypto engine to encrypt or decrypt, which can improve the
> hardware engine's efficiency.
> 
> With this optimization, On my test setup (beaglebone black board) using 64KB
> I/Os on an eMMC storage device I saw about 60% improvement in throughput for
> encrypted writes, and about 100% improvement for encrypted reads. But this
> is not fit for other modes which need different IV for each sector.
> 
> Signed-off-by: Baolin Wang 
> ---
>  drivers/md/dm-crypt.c |  188 
> +
>  1 file changed, 176 insertions(+), 12 deletions(-)
> 
> diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
> index 4f3cb35..1c86ea7 100644
> --- a/drivers/md/dm-crypt.c
> +++ b/drivers/md/dm-crypt.c
> @@ -33,6 +33,7 @@
>  #include 
>  
>  #define DM_MSG_PREFIX "crypt"
> +#define DM_MAX_SG_LIST   1024
>  
>  /*
>   * context holding the current state of a multi-part conversion
> @@ -46,6 +47,8 @@ struct convert_context {
>   sector_t cc_sector;
>   atomic_t cc_pending;
>   struct skcipher_request *req;
> + struct sg_table sgt_in;
> + struct sg_table sgt_out;
>  };
>  
>  /*
> @@ -803,6 +806,108 @@ static struct crypt_iv_operations crypt_iv_tcw_ops = {
>   .post  = crypt_iv_tcw_post
>  };
>  
> +/*
> + * Check how many sg entry numbers are needed when map one bio
> + * with scatterlists in advance.
> + */
> +static unsigned int crypt_sg_entry(struct bio *bio_t)
> +{
> + struct request_queue *q = bdev_get_queue(bio_t->bi_bdev);
> + int cluster = blk_queue_cluster(q);
> + struct bio_vec bvec, bvprv = { NULL };
> + struct bvec_iter biter;
> + unsigned long nbytes = 0, sg_length = 0;
> + unsigned int sg_cnt = 0, first_bvec = 0;
> +
> + if (bio_t->bi_rw & REQ_DISCARD) {
> + if (bio_t->bi_vcnt)
> + return 1;
> + return 0;
> + }
> +
> + if (bio_t->bi_rw & REQ_WRITE_SAME)
> + return 1;
> +
> + bio_for_each_segment(bvec, bio_t, biter) {
> + nbytes = bvec.bv_len;
> +
> + if (!cluster) {
> + sg_cnt++;
> + continue;
> + }
> +
> + if (!first_bvec) {
> + first_bvec = 1;
> + goto new_segment;
> + }
> +
> + if (sg_length + nbytes > queue_max_segment_size(q))
> + goto new_segment;
> +
> + if (!BIOVEC_PHYS_MERGEABLE(, ))
> + goto new_segment;
> +
> + if (!BIOVEC_SEG_BOUNDARY(q, , ))
> + goto new_segment;
> +
> + sg_length += nbytes;
> + continue;
> +
> +new_segment:
> + memcpy(, , sizeof(struct bio_vec));
> + sg_length = nbytes;
> + sg_cnt++;
> + }
> +
> + return sg_cnt;
> +}
> +
> +static int crypt_convert_alloc_table(struct crypt_config *cc,
> +  struct convert_context *ctx)
> +{
> + struct bio *bio_in = ctx->bio_in;
> + struct bio *bio_out = ctx->bio_out;
> + unsigned int mode = skcipher_is_bulk_mode(any_tfm(cc));

please use: bool bulk_mode = ...

> + unsigned int sg_in_max, sg_out_max;
> + int ret = 0;
> +
> + if (!mode)
> + goto out2;

Please use more descriptive label names than out[1-3]

> +
> + /*
> +  * Need to calculate how many sg entry need to be used
> +  * for this bio.
> +  */
> + sg_in_max = crypt_sg_entry(bio_in) + 1;

The return from crypt_sg_entry() is pretty awkward, given you just go on
to add 1; as is the bounds checking.. the magic value of 2 needs to be
be made clearer.

> + if (sg_in_max > DM_MAX_SG_LIST || sg_in_max <= 2)
> + goto out2;
> +
> + ret = sg_alloc_table(>sgt_in, sg_in_max, GFP_KERNEL);

Is it safe to be using GFP_KERNEL here?  AFAIK this is in the IO mapping
path and we try to avoid memory allocations at all costs -- due to the
risk of deadlock when issuing IO to stacked block devices (dm-crypt
could be part of a much more elaborate IO stack).

> + if (ret)
> + goto 

[PATCH 2/2] lib/mpi: mpi_read_raw_data(): fix nbits calculation

2016-05-26 Thread Nicolai Stange
The number of bits, nbits, is calculated in mpi_read_raw_data() as follows:

  nbits = nbytes * 8;

Afterwards, the number of leading zero bits of the first byte get
subtracted:

  nbits -= count_leading_zeros(buffer[0]);

However, count_leading_zeros() takes an unsigned long and thus,
the u8 gets promoted to an unsigned long.

Thus, the above doesn't subtract the number of leading zeros in the most
significant nonzero input byte from nbits, but the number of leading
zeros of the most significant nonzero input byte promoted to unsigned long,
i.e. BITS_PER_LONG - 8 too many.

Fix this by subtracting

  count_leading_zeros(...) - (BITS_PER_LONG - 8)

from nbits only.

Fixes: e1045992949 ("MPILIB: Provide a function to read raw data into an
 MPI")
Signed-off-by: Nicolai Stange 
---
 Applicable to linux-next-20150525.

 lib/mpi/mpicoder.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index c160393..d197210 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -50,7 +50,7 @@ MPI mpi_read_raw_data(const void *xbuffer, size_t nbytes)
return NULL;
}
if (nbytes > 0)
-   nbits -= count_leading_zeros(buffer[0]);
+   nbits -= count_leading_zeros(buffer[0]) - (BITS_PER_LONG - 8);
 
nlimbs = DIV_ROUND_UP(nbytes, BYTES_PER_MPI_LIMB);
val = mpi_alloc(nlimbs);
-- 
2.8.2

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] lib/mpi: mpi_read_raw_data(): purge redundant clearing of nbits

2016-05-26 Thread Nicolai Stange
In mpi_read_raw_data(), unsigned nbits is calculated as follows:

 nbits = nbytes * 8;

and redundantly cleared later on if nbytes == 0:

  if (nbytes > 0)
...
  else
nbits = 0;

Purge this redundant clearing for the sake of clarity.

Signed-off-by: Nicolai Stange 
---
 Applicable to linux-next-20160525.

 lib/mpi/mpicoder.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 747606f..c160393 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -51,8 +51,6 @@ MPI mpi_read_raw_data(const void *xbuffer, size_t nbytes)
}
if (nbytes > 0)
nbits -= count_leading_zeros(buffer[0]);
-   else
-   nbits = 0;
 
nlimbs = DIV_ROUND_UP(nbytes, BYTES_PER_MPI_LIMB);
val = mpi_alloc(nlimbs);
-- 
2.8.2

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] lib/mpi: purge mpi_set_buffer()

2016-05-26 Thread Nicolai Stange
mpi_set_buffer() has no in-tree users and similar functionality is provided
by mpi_read_raw_data().

Remove mpi_set_buffer().

Signed-off-by: Nicolai Stange 
---
 Applicable to linux-next-20150525.
 allmodconfig build on x86_64 succeeded.

 include/linux/mpi.h |  1 -
 lib/mpi/mpicoder.c  | 76 -
 2 files changed, 77 deletions(-)

diff --git a/include/linux/mpi.h b/include/linux/mpi.h
index 3a5abe9..f219559 100644
--- a/include/linux/mpi.h
+++ b/include/linux/mpi.h
@@ -80,7 +80,6 @@ void *mpi_get_buffer(MPI a, unsigned *nbytes, int *sign);
 int mpi_read_buffer(MPI a, uint8_t *buf, unsigned buf_len, unsigned *nbytes,
int *sign);
 void *mpi_get_secure_buffer(MPI a, unsigned *nbytes, int *sign);
-int mpi_set_buffer(MPI a, const void *buffer, unsigned nbytes, int sign);
 int mpi_write_to_sgl(MPI a, struct scatterlist *sg, unsigned *nbytes,
 int *sign);
 
diff --git a/lib/mpi/mpicoder.c b/lib/mpi/mpicoder.c
index 747606f..c742033 100644
--- a/lib/mpi/mpicoder.c
+++ b/lib/mpi/mpicoder.c
@@ -250,82 +250,6 @@ void *mpi_get_buffer(MPI a, unsigned *nbytes, int *sign)
 }
 EXPORT_SYMBOL_GPL(mpi_get_buffer);
 
-/
- * Use BUFFER to update MPI.
- */
-int mpi_set_buffer(MPI a, const void *xbuffer, unsigned nbytes, int sign)
-{
-   const uint8_t *buffer = xbuffer, *p;
-   mpi_limb_t alimb;
-   int nlimbs;
-   int i;
-
-   nlimbs = DIV_ROUND_UP(nbytes, BYTES_PER_MPI_LIMB);
-   if (RESIZE_IF_NEEDED(a, nlimbs) < 0)
-   return -ENOMEM;
-   a->sign = sign;
-
-   for (i = 0, p = buffer + nbytes - 1; p >= buffer + BYTES_PER_MPI_LIMB;) 
{
-#if BYTES_PER_MPI_LIMB == 4
-   alimb = (mpi_limb_t) *p--;
-   alimb |= (mpi_limb_t) *p-- << 8;
-   alimb |= (mpi_limb_t) *p-- << 16;
-   alimb |= (mpi_limb_t) *p-- << 24;
-#elif BYTES_PER_MPI_LIMB == 8
-   alimb = (mpi_limb_t) *p--;
-   alimb |= (mpi_limb_t) *p-- << 8;
-   alimb |= (mpi_limb_t) *p-- << 16;
-   alimb |= (mpi_limb_t) *p-- << 24;
-   alimb |= (mpi_limb_t) *p-- << 32;
-   alimb |= (mpi_limb_t) *p-- << 40;
-   alimb |= (mpi_limb_t) *p-- << 48;
-   alimb |= (mpi_limb_t) *p-- << 56;
-#else
-#error please implement for this limb size.
-#endif
-   a->d[i++] = alimb;
-   }
-   if (p >= buffer) {
-#if BYTES_PER_MPI_LIMB == 4
-   alimb = *p--;
-   if (p >= buffer)
-   alimb |= (mpi_limb_t) *p-- << 8;
-   if (p >= buffer)
-   alimb |= (mpi_limb_t) *p-- << 16;
-   if (p >= buffer)
-   alimb |= (mpi_limb_t) *p-- << 24;
-#elif BYTES_PER_MPI_LIMB == 8
-   alimb = (mpi_limb_t) *p--;
-   if (p >= buffer)
-   alimb |= (mpi_limb_t) *p-- << 8;
-   if (p >= buffer)
-   alimb |= (mpi_limb_t) *p-- << 16;
-   if (p >= buffer)
-   alimb |= (mpi_limb_t) *p-- << 24;
-   if (p >= buffer)
-   alimb |= (mpi_limb_t) *p-- << 32;
-   if (p >= buffer)
-   alimb |= (mpi_limb_t) *p-- << 40;
-   if (p >= buffer)
-   alimb |= (mpi_limb_t) *p-- << 48;
-   if (p >= buffer)
-   alimb |= (mpi_limb_t) *p-- << 56;
-#else
-#error please implement for this limb size.
-#endif
-   a->d[i++] = alimb;
-   }
-   a->nlimbs = i;
-
-   if (i != nlimbs) {
-   pr_emerg("MPI: mpi_set_buffer: Assertion failed (%d != %d)", i,
-  nlimbs);
-   BUG();
-   }
-   return 0;
-}
-EXPORT_SYMBOL_GPL(mpi_set_buffer);
-
 /**
  * mpi_write_to_sgl() - Funnction exports MPI to an sgl (msb first)
  *
-- 
2.8.2

--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SKB dst field is NULL when AEAD request complete() is called

2016-05-26 Thread Denis B
The bug was: return value of my driver's encrypt() function should
have been -EINPROGRESS to support asynchronous operation.

Thanks.

On Thu, May 26, 2016 at 12:29 PM, Herbert Xu
 wrote:
> Denis B  wrote:
>> Working with kernel 3.12.14, in AEAD mode, I register my crypto driver
>> and the givencrypt() method in the driver gets called when I send
>> IPSec traffic. I store the request, and later call its complete()
>> method from a work queue. There is no actual encryption happening at
>> the moment, I'm just testing flow. As stated, the complete() call
>> stumbles upon a NULL pointer exception in xfrm_output_resume() because
>> skb_dst(skb) is NULL. When I receive the request in givencrypt(), dst
>> is not null in the SKB.
>>
>> Why would the framework meddle with the SKB? Has anyone experienced
>> anything similar?
>
> This is probably the result of some kind of a bug in your driver.
>
> If however you're sure that your driver is doing the right thing,
> then the bug would be in the network stack.  You should discuss
> networking issues on net...@vger.kernel.org.
>
> Cheers,
> --
> Email: Herbert Xu 
> Home Page: http://gondor.apana.org.au/~herbert/
> PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: SKB dst field is NULL when AEAD request complete() is called

2016-05-26 Thread Herbert Xu
Denis B  wrote:
> Working with kernel 3.12.14, in AEAD mode, I register my crypto driver
> and the givencrypt() method in the driver gets called when I send
> IPSec traffic. I store the request, and later call its complete()
> method from a work queue. There is no actual encryption happening at
> the moment, I'm just testing flow. As stated, the complete() call
> stumbles upon a NULL pointer exception in xfrm_output_resume() because
> skb_dst(skb) is NULL. When I receive the request in givencrypt(), dst
> is not null in the SKB.
> 
> Why would the framework meddle with the SKB? Has anyone experienced
> anything similar?

This is probably the result of some kind of a bug in your driver.

If however you're sure that your driver is doing the right thing,
then the bug would be in the network stack.  You should discuss
networking issues on net...@vger.kernel.org.

Cheers,
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html