Re: [PATCH 1/6] virtio: wrap find_vqs

2017-03-30 Thread Jason Wang



On 2017-03-30 22:32, Michael S. Tsirkin wrote:

On Thu, Mar 30, 2017 at 02:00:08PM +0800, Jason Wang wrote:


On 2017-03-30 04:48, Michael S. Tsirkin wrote:

We are going to add more parameters to find_vqs; let's wrap the call so
we don't need to tweak all drivers every time.

Signed-off-by: Michael S. Tsirkin
---

At a quick glance it looks OK, but what is the benefit of this series? Is it
required by other changes?

Thanks

Yes - to avoid touching all devices when doing the rest of
the patchset.


Maybe I wasn't clear: I mean the benefit of this series, not this single
patch. I guess it may be used by your proposal to avoid the reset when
setting XDP? If yes, do we really want to drop some packets after XDP is set?


Thanks


Re: [PATCH v2] crypto: gf128mul - define gf128mul_x_* in gf128mul.h

2017-03-30 Thread Eric Biggers
On Fri, Mar 31, 2017 at 12:04:42AM +0200, Ondrej Mosnacek wrote:
> The gf128mul_x_ble function is currently defined in gf128mul.c, because
> it depends on the gf128mul_table_be multiplication table.
> 
> However, since the function is very small and only uses two values from
> the table, it is better for it to be defined as an inline function in
> gf128mul.h. That way, the function can be inlined by the compiler for
> better performance.
> 
> For consistency, the other gf128mul_x_* functions are also moved to the
> header file.
> 
> After this change, the speed of the generic 'xts(aes)' implementation
> increased from ~225 MiB/s to ~235 MiB/s (measured using 'cryptsetup
> benchmark -c aes-xts-plain64' on an Intel system with CRYPTO_AES_X86_64
> and CRYPTO_AES_NI_INTEL disabled).
> 

Reviewed-by: Eric Biggers 

Thanks,

- Eric


[PATCH v2] crypto: gf128mul - define gf128mul_x_* in gf128mul.h

2017-03-30 Thread Ondrej Mosnacek
The gf128mul_x_ble function is currently defined in gf128mul.c, because
it depends on the gf128mul_table_be multiplication table.

However, since the function is very small and only uses two values from
the table, it is better for it to be defined as an inline function in
gf128mul.h. That way, the function can be inlined by the compiler for
better performance.

For consistency, the other gf128mul_x_* functions are also moved to the
header file.

After this change, the speed of the generic 'xts(aes)' implementation
increased from ~225 MiB/s to ~235 MiB/s (measured using 'cryptsetup
benchmark -c aes-xts-plain64' on an Intel system with CRYPTO_AES_X86_64
and CRYPTO_AES_NI_INTEL disabled).

Signed-off-by: Ondrej Mosnacek 
Cc: Eric Biggers 
---
 crypto/gf128mul.c | 33 +--
 include/crypto/gf128mul.h | 49 +--
 2 files changed, 48 insertions(+), 34 deletions(-)

diff --git a/crypto/gf128mul.c b/crypto/gf128mul.c
index 04facc0..dc01212 100644
--- a/crypto/gf128mul.c
+++ b/crypto/gf128mul.c
@@ -130,43 +130,12 @@ static const u16 gf128mul_table_le[256] = gf128mul_dat(xda_le);
 static const u16 gf128mul_table_be[256] = gf128mul_dat(xda_be);
 
 /*
- * The following functions multiply a field element by x or by x^8 in
+ * The following functions multiply a field element by x^8 in
  * the polynomial field representation.  They use 64-bit word operations
  * to gain speed but compensate for machine endianness and hence work
  * correctly on both styles of machine.
  */
 
-static void gf128mul_x_lle(be128 *r, const be128 *x)
-{
-   u64 a = be64_to_cpu(x->a);
-   u64 b = be64_to_cpu(x->b);
-   u64 _tt = gf128mul_table_le[(b << 7) & 0xff];
-
-   r->b = cpu_to_be64((b >> 1) | (a << 63));
-   r->a = cpu_to_be64((a >> 1) ^ (_tt << 48));
-}
-
-static void gf128mul_x_bbe(be128 *r, const be128 *x)
-{
-   u64 a = be64_to_cpu(x->a);
-   u64 b = be64_to_cpu(x->b);
-   u64 _tt = gf128mul_table_be[a >> 63];
-
-   r->a = cpu_to_be64((a << 1) | (b >> 63));
-   r->b = cpu_to_be64((b << 1) ^ _tt);
-}
-
-void gf128mul_x_ble(be128 *r, const be128 *x)
-{
-   u64 a = le64_to_cpu(x->a);
-   u64 b = le64_to_cpu(x->b);
-   u64 _tt = gf128mul_table_be[b >> 63];
-
-   r->a = cpu_to_le64((a << 1) ^ _tt);
-   r->b = cpu_to_le64((b << 1) | (a >> 63));
-}
-EXPORT_SYMBOL(gf128mul_x_ble);
-
 static void gf128mul_x8_lle(be128 *x)
 {
u64 a = be64_to_cpu(x->a);
diff --git a/include/crypto/gf128mul.h b/include/crypto/gf128mul.h
index 0bc9b5f..2a24553 100644
--- a/include/crypto/gf128mul.h
+++ b/include/crypto/gf128mul.h
@@ -49,6 +49,7 @@
 #ifndef _CRYPTO_GF128MUL_H
 #define _CRYPTO_GF128MUL_H
 
+#include <asm/byteorder.h>
 #include <crypto/b128ops.h>
 #include <linux/slab.h>
 
@@ -163,8 +164,52 @@ void gf128mul_lle(be128 *a, const be128 *b);
 
 void gf128mul_bbe(be128 *a, const be128 *b);
 
-/* multiply by x in ble format, needed by XTS */
-void gf128mul_x_ble(be128 *a, const be128 *b);
+/*
+ * The following functions multiply a field element by x in
+ * the polynomial field representation.  They use 64-bit word operations
+ * to gain speed but compensate for machine endianness and hence work
+ * correctly on both styles of machine.
+ *
+ * They are defined here for performance.
+ */
+
+static inline void gf128mul_x_lle(be128 *r, const be128 *x)
+{
+   u64 a = be64_to_cpu(x->a);
+   u64 b = be64_to_cpu(x->b);
+
+   /* equivalent to gf128mul_table_le[(b << 7) & 0xff] >> 8
+	 * (see crypto/gf128mul.c): */
+   u64 _tt = (b & (u64)1) ? 0xe1 : 0x00;
+
+   r->b = cpu_to_be64((b >> 1) | (a << 63));
+   r->a = cpu_to_be64((a >> 1) ^ (_tt << 56));
+}
+
+static inline void gf128mul_x_bbe(be128 *r, const be128 *x)
+{
+   u64 a = be64_to_cpu(x->a);
+   u64 b = be64_to_cpu(x->b);
+
+   /* equivalent to gf128mul_table_be[a >> 63] (see crypto/gf128mul.c): */
+   u64 _tt = (a & ((u64)1 << 63)) ? 0x87 : 0x00;
+
+   r->a = cpu_to_be64((a << 1) | (b >> 63));
+   r->b = cpu_to_be64((b << 1) ^ _tt);
+}
+
+/* needed by XTS */
+static inline void gf128mul_x_ble(be128 *r, const be128 *x)
+{
+   u64 a = le64_to_cpu(x->a);
+   u64 b = le64_to_cpu(x->b);
+
+   /* equivalent to gf128mul_table_be[b >> 63] (see crypto/gf128mul.c): */
+   u64 _tt = (b & ((u64)1 << 63)) ? 0x87 : 0x00;
+
+   r->a = cpu_to_le64((a << 1) ^ _tt);
+   r->b = cpu_to_le64((b << 1) | (a >> 63));
+}
 
 /* 4k table optimization */
 
-- 
2.9.3
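
For context, a sketch of how XTS-style code consumes gf128mul_x_ble: the
tweak is multiplied by x once per 16-byte block, which is why inlining the
call pays off on long messages. The loop below is illustrative only and is
not taken from crypto/xts.c; advance_tweak and nblocks are hypothetical.

	/* Illustrative only: advance an XTS tweak across a message.
	 * gf128mul_x_ble() reads both words of x before writing r, so it
	 * is safe to update the tweak in place. */
	static void advance_tweak(be128 *t, unsigned int nblocks)
	{
		unsigned int i;

		for (i = 0; i < nblocks; i++) {
			/* ... XOR-encrypt-XOR one 16-byte block with tweak *t ... */
			gf128mul_x_ble(t, t);	/* t := t * x in GF(2^128) */
		}
	}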



Re: [PATCH] crypto: gf128mul - define gf128mul_x_ble in gf128mul.h

2017-03-30 Thread Ondrej Mosnáček
Hi Eric,

2017-03-30 21:55 GMT+02:00 Eric Biggers :
> This is an improvement; I'm just thinking that maybe this should be done
> for all the gf128mul_x_*() functions, if only so that they use a
> consistent style and are all defined next to each other.

Right, that doesn't seem to be a bad idea... I was confused for a
while by the '& 0xff' in the _lle one, but now I see it also uses just
two values of the table, so it can be rewritten in a similar way. In
fact, the OCB mode from RFC 7253 (that I'm currently trying to port to
kernel crypto API) uses gf128mul_x_bbe, so it would be useful to have
that one accessible, too.
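
For reference, the equivalence goes like this (a sketch based on the table
definitions in crypto/gf128mul.c): (b << 7) & 0xff is either 0x00 or 0x80,
so only two entries of gf128mul_table_le are reachable, and
gf128mul_table_le[0x80] is 0xe100. Since 0xe100 << 48 equals 0xe1 << 56,
the lookup collapses to:

	/* _lle: only table entries 0x00 and 0x80 are ever used */
	u64 _tt = (b & (u64)1) ? 0xe1 : 0x00;	/* then applied as _tt << 56 */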

I will move them all in v2, then.

> Also note that '(b & ((u64)1 << 63)) ? 0x87 : 0x00;' is actually getting
> compiled as '((s64)b >> 63) & 0x87', which is branchless and therefore
> makes the new version more efficient than one might expect:
>
> 	sar	$0x3f,%rax
> 	and	$0x87,%eax
>
> It could even be written the branchless way explicitly, but it shouldn't
> matter.

I think the definition using unsigned operations is more intuitive...
Let's just leave the clever tricks up to the compiler :)

Thanks,
O.M.

>
> - Eric


Re: [PATCH] crypto: gf128mul - define gf128mul_x_ble in gf128mul.h

2017-03-30 Thread Eric Biggers
Hi Ondrej,

On Thu, Mar 30, 2017 at 09:25:35PM +0200, Ondrej Mosnacek wrote:
> The gf128mul_x_ble function is currently defined in gf128mul.c, because
> it depends on the gf128mul_table_be multiplication table.
> 
> However, since the function is very small and only uses two values from
> the table, it is better for it to be defined as an inline function in
> gf128mul.h. That way, the function can be inlined by the compiler for
> better performance.
> 
> After this change, the speed of the generic 'xts(aes)' implementation
> increased from ~225 MiB/s to ~235 MiB/s (measured using 'cryptsetup
> benchmark' on an Intel system with CRYPTO_AES_X86_64 and
> CRYPTO_AES_NI_INTEL disabled).
> 
> Signed-off-by: Ondrej Mosnacek 
...
>  
> -/* multiply by x in ble format, needed by XTS */
> -void gf128mul_x_ble(be128 *a, const be128 *b);
> +/* Multiply by x in ble format, needed by XTS.
> + * Defined here for performance. */
> +static inline void gf128mul_x_ble(be128 *r, const be128 *x)
> +{
> + u64 a = le64_to_cpu(x->a);
> + u64 b = le64_to_cpu(x->b);
> + /* equivalent to gf128mul_table_be[b >> 63] (see crypto/gf128mul.c): */
> + u64 _tt = (b & ((u64)1 << 63)) ? 0x87 : 0x00;
> +
> + r->a = cpu_to_le64((a << 1) ^ _tt);
> + r->b = cpu_to_le64((b << 1) | (a >> 63));
> +}
>  
>  /* 4k table optimization */
>  
> -- 
> 2.9.3
> 

This is an improvement; I'm just thinking that maybe this should be done for all
the gf128mul_x_*() functions, if only so that they use a consistent style and
are all defined next to each other.

Also note that '(b & ((u64)1 << 63)) ? 0x87 : 0x00;' is actually getting
compiled as '((s64)b >> 63) & 0x87', which is branchless and therefore makes the
new version more efficient than one might expect:

	sar	$0x3f,%rax
	and	$0x87,%eax

It could even be written the branchless way explicitly, but it shouldn't matter.
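
For illustration, the explicit branchless form would be a one-liner (a
sketch; as shown above, the compiler already emits this):

	/* Arithmetic right shift replicates the top bit of b across all
	 * 64 bits; the mask then yields 0x87 or 0x00 without a branch. */
	u64 _tt = ((u64)((s64)b >> 63)) & 0x87;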

- Eric


[PATCH] crypto: gf128mul - define gf128mul_x_ble in gf128mul.h

2017-03-30 Thread Ondrej Mosnacek
The gf128mul_x_ble function is currently defined in gf128mul.c, because
it depends on the gf128mul_table_be multiplication table.

However, since the function is very small and only uses two values from
the table, it is better for it to be defined as an inline function in
gf128mul.h. That way, the function can be inlined by the compiler for
better performance.

After this change, the speed of the generic 'xts(aes)' implementation
increased from ~225 MiB/s to ~235 MiB/s (measured using 'cryptsetup
benchmark' on an Intel system with CRYPTO_AES_X86_64 and
CRYPTO_AES_NI_INTEL disabled).

Signed-off-by: Ondrej Mosnacek 
---
 crypto/gf128mul.c | 11 ---
 include/crypto/gf128mul.h | 15 +--
 2 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/crypto/gf128mul.c b/crypto/gf128mul.c
index 04facc0..2eab1a1 100644
--- a/crypto/gf128mul.c
+++ b/crypto/gf128mul.c
@@ -156,17 +156,6 @@ static void gf128mul_x_bbe(be128 *r, const be128 *x)
r->b = cpu_to_be64((b << 1) ^ _tt);
 }
 
-void gf128mul_x_ble(be128 *r, const be128 *x)
-{
-   u64 a = le64_to_cpu(x->a);
-   u64 b = le64_to_cpu(x->b);
-   u64 _tt = gf128mul_table_be[b >> 63];
-
-   r->a = cpu_to_le64((a << 1) ^ _tt);
-   r->b = cpu_to_le64((b << 1) | (a >> 63));
-}
-EXPORT_SYMBOL(gf128mul_x_ble);
-
 static void gf128mul_x8_lle(be128 *x)
 {
u64 a = be64_to_cpu(x->a);
diff --git a/include/crypto/gf128mul.h b/include/crypto/gf128mul.h
index 0bc9b5f..46a01a2 100644
--- a/include/crypto/gf128mul.h
+++ b/include/crypto/gf128mul.h
@@ -49,6 +49,7 @@
 #ifndef _CRYPTO_GF128MUL_H
 #define _CRYPTO_GF128MUL_H
 
+#include <asm/byteorder.h>
 #include <crypto/b128ops.h>
 #include <linux/slab.h>
 
@@ -163,8 +164,18 @@ void gf128mul_lle(be128 *a, const be128 *b);
 
 void gf128mul_bbe(be128 *a, const be128 *b);
 
-/* multiply by x in ble format, needed by XTS */
-void gf128mul_x_ble(be128 *a, const be128 *b);
+/* Multiply by x in ble format, needed by XTS.
+ * Defined here for performance. */
+static inline void gf128mul_x_ble(be128 *r, const be128 *x)
+{
+   u64 a = le64_to_cpu(x->a);
+   u64 b = le64_to_cpu(x->b);
+   /* equivalent to gf128mul_table_be[b >> 63] (see crypto/gf128mul.c): */
+   u64 _tt = (b & ((u64)1 << 63)) ? 0x87 : 0x00;
+
+   r->a = cpu_to_le64((a << 1) ^ _tt);
+   r->b = cpu_to_le64((b << 1) | (a >> 63));
+}
 
 /* 4k table optimization */
 
-- 
2.9.3



Re: [PATCH] crypto: vmx: Remove dubiously licensed crypto code

2017-03-30 Thread Paulo Flabiano Smorigo

On 2017-03-29 20:08, Tyrel Datwyler wrote:

On 03/29/2017 08:13 AM, Michal Suchánek wrote:

On Wed, 29 Mar 2017 16:51:35 +0200
Greg Kroah-Hartman  wrote:


On Wed, Mar 29, 2017 at 02:56:39PM +0200, Michal Suchanek wrote:

While reviewing commit 11c6e16ee13a ("crypto: vmx - Adding asm
subroutines for XTS"), which adds the OpenSSL license header to
drivers/crypto/vmx/aesp8-ppc.pl, the licensing of this driver came into
question. The whole license reads:

 # Licensed under the OpenSSL license (the "License").  You may not use
 # this file except in compliance with the License.  You can obtain a
 # copy in the file LICENSE in the source distribution or at
 # https://www.openssl.org/source/license.html
 #
 # Written by Andy Polyakov for the OpenSSL
 # project. The module is, however, dual licensed under OpenSSL and
 # CRYPTOGAMS licenses depending on where you obtain it. For further
 # details see http://www.openssl.org/~appro/cryptogams/.


After seeking legal advice it is still not clear that this driver
can be legally used in Linux. In particular the "depending on where
you obtain it" part does not make it clear when you can apply the
GPL and when the OpenSSL license.

I tried contacting the author of the code for clarification but did
not hear back. In absence of clear licensing the only solution I
see is removing this code.


A quick 'git grep OpenSSL' of the Linux tree returns several other
crypto files under the ARM architecture that are similarly licensed. 
Namely:


arch/arm/crypto/sha1-armv4-large.S
arch/arm/crypto/sha256-armv4.pl
arch/arm/crypto/sha256-core.S_shipped
arch/arm/crypto/sha512-armv4.pl
arch/arm/crypto/sha512-core.S_shipped
arch/arm64/crypto/sha256-core.S_shipped
arch/arm64/crypto/sha512-armv8.pl
arch/arm64/crypto/sha512-core.S_shipped

On closer inspection, some of those files have the addendum that
"Permission to use under GPL terms is granted", but not all of them.

-Tyrel


In 2015, Andy Polyakov, the author, replied in this mailing list [1]:

"I have no problems with reusing assembly modules in kernel context. The
whole idea behind cryptogams initiative was exactly to reuse code in
different contexts."

[1] https://patchwork.kernel.org/patch/6027481/

--
Paulo Flabiano Smorigo
IBM Linux Technology Center



Re: [PATCH 1/6] virtio: wrap find_vqs

2017-03-30 Thread Michael S. Tsirkin
On Thu, Mar 30, 2017 at 02:00:08PM +0800, Jason Wang wrote:
> 
> 
> On 2017-03-30 04:48, Michael S. Tsirkin wrote:
> > We are going to add more parameters to find_vqs; let's wrap the call so
> > we don't need to tweak all drivers every time.
> > 
> > Signed-off-by: Michael S. Tsirkin
> > ---
> 
> At a quick glance it looks OK, but what is the benefit of this series? Is it
> required by other changes?
> 
> Thanks

Yes - to avoid touching all devices when doing the rest of
the patchset.


Re: [PATCH v5 1/2] crypto: skcipher AF_ALG - overhaul memory management

2017-03-30 Thread Stephan Müller
On Thursday, 16 March 2017 at 10:52:48 CEST, Herbert Xu wrote:

Hi Herbert,

> More importantly, with the current code, a very large recvmsg
> would still work by doing it piecemeal.  With your patch, won't
> it fail because sock_kmalloc could fail to allocate memory for
> the whole thing?

For testing purposes, I wrote the app that is attached. It simply encrypts a
given file with ctr(aes).

On the current implementation of algif_skcipher where I directly pipe in the 
provided memory block (i.e. using the stream operation of libkcapi):

dd if=/dev/zero of=testfile bs=1024 count=100
./kcapi-long testfile testfile.out
encryption failed with error -14

==> The -EFAULT happens at sendmsg() -- i.e. you cannot provide as much data 
as you want.

==> On the proposed update, I see the same error.

When I use vmsplice, the current implementation somehow simply waits (I do not 
yet know what happens). My new implementation simply returns the amount of 
data that could have been spliced (64k -- which would allow user space to 
implement a loop to send chunks).


When I ask libkcapi to send the data in chunks, libkcapi implements the loop 
that sends/receives the data. In this case, both implementations of 
algif_skcipher.c work to encrypt the whole file regardless of its size.

Thus, I would conclude that the outer loop in recvmsg of the current
algif_skcipher is not really helpful, as the bottleneck is on the sendmsg side.
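
Such a sending loop could look like the sketch below, reusing only the
stream calls from the attached app; the 64k chunk size, the error handling,
and the encrypt_chunked helper are assumptions, not libkcapi source.

#include <sys/types.h>
#include <sys/uio.h>
#include <stdint.h>
#include <kcapi.h>

#define CHUNK (1UL << 16)	/* 64k, matching the splice limit above */

/* Send and receive the buffer in bounded pieces so no single
 * sendmsg() sees more data than the kernel accepts at once. */
static int encrypt_chunked(struct kcapi_handle *handle, const uint8_t *iv,
			   uint8_t *in, uint8_t *out, size_t len)
{
	struct iovec iniov, outiov;
	size_t off = 0;
	ssize_t ret;

	ret = kcapi_cipher_stream_init_enc(handle, iv, NULL, 0);
	if (ret < 0)
		return (int)ret;

	while (off < len) {
		size_t todo = (len - off > CHUNK) ? CHUNK : (len - off);

		iniov.iov_base = in + off;
		iniov.iov_len = todo;
		outiov.iov_base = out + off;
		outiov.iov_len = todo;

		ret = kcapi_cipher_stream_update(handle, &iniov, 1);
		if (ret < 0)
			return (int)ret;
		ret = kcapi_cipher_stream_op(handle, &outiov, 1);
		if (ret < 0)
			return (int)ret;
		off += todo;
	}
	return 0;
}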


With this, I would conclude that the new implementation of algif_skcipher.c 
proposed in this patch set has the same behavior as the old one.

Ciao
Stephan

/*
 * Copyright (C) 2017, Stephan Mueller 
 *
 * License: see LICENSE file in root directory
 *
 * THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESS OR IMPLIED
 * WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
 * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, ALL OF
 * WHICH ARE HEREBY DISCLAIMED.  IN NO EVENT SHALL THE AUTHOR BE
 * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
 * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT
 * OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
 * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
 * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
 * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE
 * USE OF THIS SOFTWARE, EVEN IF NOT ADVISED OF THE POSSIBILITY OF SUCH
 * DAMAGE.
 */

#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/mman.h>
#include <sys/uio.h>

#include <kcapi.h>

static int check_filetype(int fd, struct stat *sb, const char *filename)
{
	fstat(fd, sb);

	/* Do not return an error in case we cannot validate the data. */
	if ((sb->st_mode & S_IFMT) != S_IFREG &&
	(sb->st_mode & S_IFMT) != S_IFLNK) {
		fprintf(stderr, "%s is no regular file or symlink\n", filename);
		return -EINVAL;
	}

	return 0;
}

static int crypt(struct kcapi_handle *handle, const uint8_t *iv,
		 const char *infile, const char *outfile)
{
	int infd = -1, outfd = -1;
	int ret = 0;
	struct stat insb, outsb;
	uint8_t *inmem = NULL, *outmem = NULL;
	size_t outsize;

	infd = open(infile, O_RDONLY | O_CLOEXEC);
	if (infd < 0) {
		fprintf(stderr, "Cannot open file %s: %s\n", infile,
			strerror(errno));
		return -EIO;
	}

	outfd = open(outfile, O_RDWR | O_CLOEXEC | O_CREAT, S_IRWXU | S_IRWXG | S_IRWXO);
	if (outfd < 0) {
		fprintf(stderr, "Cannot open file %s: %s\n", infile,
			strerror(errno));
		ret = -EIO;
		goto out;
	}

	ret = check_filetype(infd, &insb, infile);
	if (ret)
		goto out;

	ret = check_filetype(outfd, &outsb, outfile);
	if (ret)
		goto out;

	if (insb.st_size) {
		inmem = mmap(NULL, insb.st_size, PROT_READ, MAP_SHARED,
			 infd, 0);
		if (inmem == MAP_FAILED)
		{
			fprintf(stderr, "Use of mmap failed\n");
			ret = -ENOMEM;
			goto out;
		}
	}
	outsize = ((insb.st_size + kcapi_cipher_blocksize(handle) - 1) /
		   kcapi_cipher_blocksize(handle)) *
		kcapi_cipher_blocksize(handle);

	ret = ftruncate(outfd, outsize);
	if (ret)
		goto out;

	if (outsize) {
		outmem = mmap(NULL, outsize, PROT_WRITE, MAP_SHARED,
			 outfd, 0);
		if (outmem == MAP_FAILED)
		{
			fprintf(stderr, "Use of mmap failed\n");
			ret = -ENOMEM;
			goto out;
		}
	}

#if 1
	/* Send all data in one go, libkcapi will not loop */
	struct iovec iniov, outiov;

	iniov.iov_base = inmem;
	iniov.iov_len = insb.st_size;
	outiov.iov_base = outmem;
	outiov.iov_len = outsize;

	ret = kcapi_cipher_stream_init_enc(handle, iv, NULL, 0);
	if (ret)
		goto out;

	ret = kcapi_cipher_stream_update(handle, &iniov, 1);
	if (ret)
		goto out;

	ret = kcapi_cipher_stream_op(handle, &outiov, 1);
#else
	/* libkcapi will loop over the data and send it in chunks */
	ret = kcapi_cipher_encrypt(handle, inmem, insb.st_size, iv,
				   outmem, outsize, KCAPI_ACCESS_SENDMSG);
#endif

out:
	if (inmem && inmem != MAP_FAILED)
		munmap(inmem, insb.st_size);
	if (outmem && outmem != MAP_FAILED)
		munmap(outmem, outsize);
	if (infd 

Re: [PATCH 1/6] virtio: wrap find_vqs

2017-03-30 Thread Cornelia Huck
On Wed, 29 Mar 2017 23:48:44 +0300
"Michael S. Tsirkin"  wrote:

> We are going to add more parameters to find_vqs; let's wrap the call so
> we don't need to tweak all drivers every time.
> 
> Signed-off-by: Michael S. Tsirkin 
> ---
>  drivers/block/virtio_blk.c | 3 +--
>  drivers/char/virtio_console.c  | 6 +++---
>  drivers/crypto/virtio/virtio_crypto_core.c | 3 +--
>  drivers/gpu/drm/virtio/virtgpu_kms.c   | 3 +--
>  drivers/net/caif/caif_virtio.c | 3 +--
>  drivers/net/virtio_net.c   | 3 +--
>  drivers/rpmsg/virtio_rpmsg_bus.c   | 2 +-
>  drivers/scsi/virtio_scsi.c | 3 +--
>  drivers/virtio/virtio_balloon.c| 3 +--
>  drivers/virtio/virtio_input.c  | 3 +--
>  include/linux/virtio_config.h  | 9 +
>  net/vmw_vsock/virtio_transport.c   | 6 +++---
>  12 files changed, 24 insertions(+), 23 deletions(-)

Regardless of whether that context thing is the right thing to do, this
looks like a sensible cleanup.
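
The wrapper itself is small; a sketch of the idea (simplified, not the
exact hunk from include/linux/virtio_config.h):

	/* Drivers call the wrapper instead of the transport op directly,
	 * so future parameters only need to be added in one place. */
	static inline int virtio_find_vqs(struct virtio_device *vdev,
					  unsigned int nvqs,
					  struct virtqueue *vqs[],
					  vq_callback_t *callbacks[],
					  const char * const names[])
	{
		return vdev->config->find_vqs(vdev, nvqs, vqs,
					      callbacks, names);
	}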



RE: [RFC TLS Offload Support 00/15] cover letter

2017-03-30 Thread Boris Pismenny
> >> TLS Tx crypto offload is a new feature of network devices. It enables
> >> the kernel TLS socket to skip encryption and authentication
> >> operations on the transmit side of the data path, delegating those to
> >> the NIC. In turn, the NIC encrypts packets that belong to an
> >> offloaded TLS socket on the fly. The NIC does not modify any packet
> >> headers. It expects to receive fully framed TCP packets with TLS
> >> records as payload. The NIC replaces plaintext with ciphertext and
> >> fills the authentication tag. The NIC does not hold any state beyond
> >> the context needed to encrypt the next expected packet, i.e. expected
> >> TCP sequence number and crypto state.
> >
> > It seems like, since you do the TLS framing in TCP and the card is
> > expecting to fill in certain aspects, there is a requirement that the
> > packet contents aren't mangled between the TLS framing code and when
> > the SKB hits the card.
> >
> > Is this right?
> >
> > For example, what happens if netfilter splits a TLS Tx offloaded frame
> > into two TCP segments?
We maintain the crypto context by tracking TCP sequence numbers, so splitting
TCP segments is not a problem. Even if reordering is introduced anywhere
between TCP and the driver, we can identify it by the TCP sequence number
and handle it gracefully; see mlx_tls_tx_handler and the sketch below.
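
A sketch of the kind of check this implies (hypothetical types and field
names; the real logic lives in the driver):

	/* The NIC context only knows how to encrypt the next expected
	 * TCP sequence number.  An in-order skb takes the offload path;
	 * anything else (retransmit, reorder) takes a resync path such
	 * as handle_ooo. */
	struct tls_offload_ctx_sketch {
		u32 expected_tcp_seq;
	};

	static bool tls_tx_skb_in_order(const struct tls_offload_ctx_sketch *ctx,
					u32 skb_tcp_seq)
	{
		return skb_tcp_seq == ctx->expected_tcp_seq;
	}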

> 
> Furthermore, it doesn't seem to work with bonding or any other virtual
> interface, which could move the skb's to be processed on another NIC, as the
> context is put onto the NIC. Even a redirect can not be processed anymore
> (seems like those patches try to stick the connection to an interface anyway).
> 
> Wouldn't it be possible to keep the state in software and push down a
> security context per skb, which gets applied during sending? If not
> possible via hw, the slowpath can encrypt the packet in sw.
We do have all the state required to encrypt a TLS packet in software. But
pushing down the state for each skb is too expensive, because the state
depends on all data in the TLS record; essentially, it requires resending the
record up to that skb. This is accomplished for OOO packets in the
"handle_ooo" function in mlx_tls.

Maybe we could use that functionality to handle bonding, but at first it would
be easier to prevent it.

The slowpath you've mentioned is tricky, because you need to decide in
advance that a TLS record will use the slowpath; after the plaintext
TLS record is pushed into TCP, it is difficult to fall back to software crypto.
> 
> Also sticking connections to outgoing interfaces might work for TX, but you
> can't force the interface where packets come in.
Right. This RFC handles only TX offload.
> 
> Bye,
> Hannes

Thanks,
Boris



Re: [PATCH 1/6] virtio: wrap find_vqs

2017-03-30 Thread Jason Wang



On 2017-03-30 04:48, Michael S. Tsirkin wrote:

We are going to add more parameters to find_vqs; let's wrap the call so
we don't need to tweak all drivers every time.

Signed-off-by: Michael S. Tsirkin
---


At a quick glance it looks OK, but what is the benefit of this series? Is
it required by other changes?


Thanks