Re: comparison of the AF_ALG interface with the /dev/crypto

2011-09-01 Thread Nikos Mavrogiannopoulos

On 09/01/2011 04:15 AM, Herbert Xu wrote:

> Nikos Mavrogiannopoulos n...@gnutls.org wrote:
>
>> Given my benchmarks have no issues, it is not apparent to me why one
>> should use AF_ALG instead of cryptodev. I do not know though why AF_ALG
>> performs so poor. I'd speculate by blaming it on the usage of the socket
>> API and the number of system calls required.
>
> The target usage of AF_ALG is hardware offload devices that cannot
> be directly used in user-space, not software crypto on implementations
> such as AESNI/Padlock.
>
> Going through the kernel to use something like AESNI/Padlock or
> software crypto is insane.
>
> Given the intended target case, your numbers are pretty much
> meaningless as cryptodev's performance can be easily beaten
> by a pure user-space implementation.


Actually, this is the reason for the ecb(cipher-null) comparison: to
emulate the case of a hardware offload device. I tried to make that
clear in the text, but maybe I did not. As you can see, AF_ALG performs
really badly in that case. It performs better when a software or a
Padlock implementation of AES is involved (which, as you say, is a
useless use case).


Of course, I don't own such an offloading device and cannot test it 
directly. If you have different values from a benchmark with an actual 
hardware accelerator, I'll be happy to include them.


regards,
Nikos
--
To unsubscribe from this list: send the line unsubscribe linux-crypto in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: comparison of the AF_ALG interface with the /dev/crypto

2011-09-01 Thread Herbert Xu
On Thu, Sep 01, 2011 at 08:26:07AM +0200, Nikos Mavrogiannopoulos wrote:

> Actually, this is the reason for the ecb(cipher-null) comparison: to
> emulate the case of a hardware offload device. I tried to make that
> clear in the text, but maybe I did not. As you can see, AF_ALG performs
> really badly in that case. It performs better when a software or a
> Padlock implementation of AES is involved (which, as you say, is a
> useless use case).

It's meaningless because such devices operate at a rate much
lower than the figures you give.

Cheers,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: comparison of the AF_ALG interface with the /dev/crypto

2011-09-01 Thread Nikos Mavrogiannopoulos

On 09/01/2011 08:43 AM, Herbert Xu wrote:

> On Thu, Sep 01, 2011 at 08:26:07AM +0200, Nikos Mavrogiannopoulos wrote:
>
>> Actually, this is the reason for the ecb(cipher-null) comparison: to
>> emulate the case of a hardware offload device. I tried to make that
>> clear in the text, but maybe I did not. As you can see, AF_ALG performs
>> really badly in that case. It performs better when a software or a
>> Padlock implementation of AES is involved (which, as you say, is a
>> useless use case).
>
> It's meaningless because such devices operate at a rate much
> lower than the figures you give.


Have you actually measured that?



Re: comparison of the AF_ALG interface with the /dev/crypto

2011-09-01 Thread Herbert Xu
On Thu, Sep 01, 2011 at 08:54:19AM +0200, Nikos Mavrogiannopoulos wrote:

> Have you actually measured that?

Not against your cryptodev code-base.

Cheers,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: comparison of the AF_ALG interface with the /dev/crypto

2011-09-01 Thread Nikos Mavrogiannopoulos
On Thu, Sep 1, 2011 at 4:14 PM, Herbert Xu herb...@gondor.hengli.com.au wrote:

> Are you maxing out your submission CPU? If not then you're testing
> the latency of the interface, as opposed to the throughput.

I think it is obvious that a benchmark of throughput measures
throughput. If, however, you think that AF_ALG is at a disadvantage in
this benchmark because it is a high-latency interface, you're free to
propose and perform another one. I haven't seen anywhere how this
interface was supposed to be used, nor anything about its qualities (high
latency, maybe(?) high throughput or so). Thus, I designed this
benchmark with a use-case in mind, i.e., a TLS or DTLS tunnel
executing in a system with such an accelerator. There might be other
benchmarks with other use cases in mind, but I haven't seen any.

regards,
Nikos


Re: comparison of the AF_ALG interface with the /dev/crypto

2011-09-01 Thread Nikos Mavrogiannopoulos
On Thu, Sep 1, 2011 at 4:59 PM, Herbert Xu herb...@gondor.hengli.com.au wrote:

>> latency, maybe(?) high throughput or so). Thus, I designed this
>> benchmark with a use-case in mind, i.e., a TLS or DTLS tunnel
>> executing in a system with such an accelerator. There might be other
>> benchmarks with other use cases in mind, but I haven't seen any.
>
> Putting TLS data-path in user-space is always going to be less
> than optimal, especially with hardware crypto offload, since you'll
> be crossing the user-space/kernel boundary multiple times.

Indeed, but today that's what we have in some systems: user-space TLS
implementations (GnuTLS and OpenSSL) and kernel-space crypto
offloading. The purpose of the /dev/crypto and AF_ALG interfaces is to
connect those together. It would be interesting to have a partial
kernel-space TLS implementation but I don't know whether such a thing
could ever make it into the kernel.

regards,
Nikos


Re: comparison of the AF_ALG interface with the /dev/crypto

2011-09-01 Thread Herbert Xu
On Thu, Sep 01, 2011 at 05:06:06PM +0200, Nikos Mavrogiannopoulos wrote:

> Indeed, but today that's what we have in some systems: user-space TLS
> implementations (GnuTLS and OpenSSL) and kernel-space crypto
> offloading. The purpose of the /dev/crypto and AF_ALG interfaces is to
> connect those together. It would be interesting to have a partial
> kernel-space TLS implementation but I don't know whether such a thing
> could ever make it into the kernel.

Well we've talked about a kernel implementation of the data path
previously and I don't think there is any opposition to the idea.

The only thing missing is an implementation.

Cheers,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: comparison of the AF_ALG interface with the /dev/crypto

2011-09-01 Thread Phil Sutter
Herbert,

On Thu, Sep 01, 2011 at 10:14:45PM +0800, Herbert Xu wrote:
> Phil Sutter p...@nwl.cc wrote:
>>
>> chunksize   af_alg         cryptodev      (100 * cryptodev / af_alg)
>> --------------------------------------------------------------------
>>       512    4.169 MB/s     7.113 MB/s    171 %
>>      1024    7.904 MB/s    12.957 MB/s    164 %
>>      2048   13.163 MB/s    19.683 MB/s    150 %
>>      4096   20.218 MB/s    26.960 MB/s    133 %
>>      8192   27.539 MB/s    34.373 MB/s    125 %
>>     16384   33.730 MB/s    39.997 MB/s    119 %
>>     32768   37.399 MB/s    42.727 MB/s    114 %
>>     65536   40.004 MB/s    44.660 MB/s    112 %
>
> Are you maxing out your submission CPU? If not then you're testing
> the latency of the interface, as opposed to the throughput.

Good point. So in order to also test the throughput, I've put my OpenRD
under load:

| stress -c 2 -i 2 -m 2 --vm-bytes 64MB

and ran the tests again:

chunksize   af_alg         cryptodev      (100 * cryptodev / af_alg)
--------------------------------------------------------------------
      512    0.618 MB/s     1.14 MB/s     184 %
     1024    1.258 MB/s     2.28 MB/s     181 %
     2048    2.453 MB/s     4.39 MB/s     179 %
     4096    4.540 MB/s     7.76 MB/s     171 %
     8192    7.981 MB/s    11.67 MB/s     146 %
    16384   12.543 MB/s    14.08 MB/s     112 %
    32768   13.139 MB/s    14.46 MB/s     110 %
    65536   14.254 MB/s    15.55 MB/s     109 %

So that means cryptodev-linux is superior in throughput as well as
latency, right? Or is it the lower latency of the interface causing the
higher throughput?

Greetings, Phil


Re: comparison of the AF_ALG interface with the /dev/crypto

2011-09-01 Thread Herbert Xu
On Thu, Sep 01, 2011 at 05:09:28PM +0200, Phil Sutter wrote:

> Good point. So in order to also test the throughput, I've put my OpenRD
> under load:

No that's not what I meant.  You're pushing a request to an
async device and waiting for a response to come back before
pushing the next request.  In order to maximise throughput,
you need to issue your requests without waiting for the responses
synchronously.

Cheers,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: comparison of the AF_ALG interface with the /dev/crypto

2011-09-01 Thread David Miller
From: Nikos Mavrogiannopoulos n...@gnutls.org
Date: Thu, 1 Sep 2011 17:06:06 +0200

> It would be interesting to have a partial kernel-space TLS
> implementation but I don't know whether such a thing could ever make
> it into the kernel.

Herbert and I have discussed this several times and we plan on
implementing this at some point.


Re: comparison of the AF_ALG interface with the /dev/crypto

2011-09-01 Thread Nikos Mavrogiannopoulos

On 09/01/2011 05:32 PM, David Miller wrote:

> From: Nikos Mavrogiannopoulos n...@gnutls.org
> Date: Thu, 1 Sep 2011 17:06:06 +0200
>
>> It would be interesting to have a partial kernel-space TLS
>> implementation but I don't know whether such a thing could ever make
>> it into the kernel.
>
> Herbert and I have discussed this several times and we plan on
> implementing this at some point.


The problem is that TLS is not a universal thing. There is still SSH,
Kerberos, OpenVPN (as far as I remember, it is a custom protocol), etc.
It makes sense to have something that applies broadly, especially when
it is in the Linux kernel. Currently, having a device such as
/dev/crypto looks like a good compromise.


regards,
Nikos


Re: comparison of the AF_ALG interface with the /dev/crypto

2011-08-31 Thread Herbert Xu
Nikos Mavrogiannopoulos n...@gnutls.org wrote:

> Given my benchmarks have no issues, it is not apparent to me why one
> should use AF_ALG instead of cryptodev. I do not know though why AF_ALG
> performs so poor. I'd speculate by blaming it on the usage of the socket
> API and the number of system calls required.

The target usage of AF_ALG is hardware offload devices that cannot
be directly used in user-space, not software crypto on implementations
such as AESNI/Padlock.

Going through the kernel to use something like AESNI/Padlock or
software crypto is insane.

Given the intended target case, your numbers are pretty much
meaningless as cryptodev's performance can be easily beaten
by a pure user-space implementation.

Cheers,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt


Re: comparison of the AF_ALG interface with the /dev/crypto

2011-08-29 Thread Nikos Mavrogiannopoulos

On 08/28/2011 10:35 PM, David Miller wrote:

>> The benchmark idea was to test the speed of initialization, encryption
>> and deinitialization, as well as the encryption speed alone. These are the
>> most common use cases of the frameworks (i.e. how they would be used
>> by a cryptographic library).
>
> Be sure to use splice() with AF_ALG for maximum performance.
> For example, see the test program below.  You'll need to replace
> 8192 with whatever the page size is on your cpu.


As I understand it, with splice you can encrypt only page-aligned data
that spans a multiple of pages. This is a very uncommon case. My
benchmark targets the generic case, i.e., the way this interface will
be used in crypto libraries like GnuTLS.

However, I'll update the comparison page to include the splice version
as well.


regards,
Nikos


Re: comparison of the AF_ALG interface with the /dev/crypto

2011-08-29 Thread David Miller
From: Nikos Mavrogiannopoulos n...@gnutls.org
Date: Mon, 29 Aug 2011 09:32:19 +0200

> On 08/28/2011 10:35 PM, David Miller wrote:
>
>>> The benchmark idea was to test the speed of initialization, encryption
>>> and deinitialization, as well as the encryption speed alone. These are the
>>> most common use cases of the frameworks (i.e. how they would be used
>>> by a cryptographic library).
>>
>> Be sure to use splice() with AF_ALG for maximum performance.
>> For example, see the test program below.  You'll need to replace
>> 8192 with whatever the page size is on your cpu.
>
> As I understand it, with splice you can encrypt only page-aligned data
> that spans a multiple of pages. This is a very uncommon case. My
> benchmark targets the generic case, i.e., the way this interface will
> be used in crypto libraries like GnuTLS.

Only the buffer you use must have these properties, you can use
whatever lengths you like.


Re: comparison of the AF_ALG interface with the /dev/crypto

2011-08-28 Thread David Miller
From: Nikos Mavrogiannopoulos n...@gnutls.org
Date: Sun, 28 Aug 2011 15:17:00 +0200

> The benchmark idea was to test the speed of initialization, encryption
> and deinitialization, as well as the encryption speed alone. These are the
> most common use cases of the frameworks (i.e. how they would be used
> by a cryptographic library).

Be sure to use splice() with AF_ALG for maximum performance.

For example, see the test program below.  You'll need to replace
8192 with whatever the page size is on your cpu.


#define _GNU_SOURCE		/* for splice() and vmsplice() */
#include <fcntl.h>
#include <openssl/aes.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <linux/types.h>

/* Fallbacks for C libraries whose headers do not define these yet. */
#ifndef AF_ALG
#define AF_ALG 38
#endif
#ifndef SOL_ALG
#define SOL_ALG 279
#endif

#ifndef SPLICE_F_GIFT
#define SPLICE_F_GIFT	(0x08)	/* pages passed in are a gift */
#endif

struct sockaddr_alg {
	__u16	salg_family;
	__u8	salg_type[14];
	__u32	salg_feat;
	__u32	salg_mask;
	__u8	salg_name[64];
};

struct af_alg_iv {
	__u32	ivlen;
	__u8	iv[0];
};

/* Socket options */
#define ALG_SET_KEY	1
#define ALG_SET_IV	2
#define ALG_SET_OP	3

/* Operations */
#define ALG_OP_DECRYPT	0
#define ALG_OP_ENCRYPT	1

/* Page-aligned buffer; 8192 stands in for the page size. */
static char buf[8192] __attribute__((__aligned__(8192)));

/* Reference path: AES-128-CBC in user space via OpenSSL. */
static void crypt_ssl(const char *key, char *iv, int i)
{
	AES_KEY akey;

	AES_set_encrypt_key((const unsigned char *)key, 128, &akey);

	while (i--)
		AES_cbc_encrypt((unsigned char *)buf, (unsigned char *)buf,
				8192, &akey, (unsigned char *)iv, AES_ENCRYPT);
}

/* Kernel path: AES-128-CBC through AF_ALG, feeding data with splice(). */
static void crypt_kernel(const char *key, char *oiv, int i)
{
	int opfd;
	int tfmfd;
	struct sockaddr_alg sa = {
		.salg_family = AF_ALG,
		.salg_type = "skcipher",
		.salg_name = "cbc(aes)"
	};
	struct msghdr msg = {};
	struct cmsghdr *cmsg;
	char cbuf[CMSG_SPACE(4) + CMSG_SPACE(20)] = {};
	struct aes_iv {
		__u32 len;
		__u8 iv[16];
	} *iv;
	struct iovec iov;
	int pipes[2];

	pipe(pipes);

	/* The bound socket represents the transform and carries the key. */
	tfmfd = socket(AF_ALG, SOCK_SEQPACKET, 0);

	bind(tfmfd, (struct sockaddr *)&sa, sizeof(sa));

	setsockopt(tfmfd, SOL_ALG, ALG_SET_KEY, key, 16);

	/* The accepted socket is used for the actual operations. */
	opfd = accept(tfmfd, NULL, 0);

	msg.msg_control = cbuf;
	msg.msg_controllen = sizeof(cbuf);

	/* First control message: select encryption. */
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type = ALG_SET_OP;
	cmsg->cmsg_len = CMSG_LEN(4);
	*(__u32 *)CMSG_DATA(cmsg) = ALG_OP_ENCRYPT;

	/* Second control message: 4-byte length followed by the 16-byte IV. */
	cmsg = CMSG_NXTHDR(&msg, cmsg);
	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type = ALG_SET_IV;
	cmsg->cmsg_len = CMSG_LEN(20);
	iv = (void *)CMSG_DATA(cmsg);
	iv->len = 16;
	memcpy(iv->iv, oiv, 16);

	iov.iov_base = buf;
	iov.iov_len = 8192;

	msg.msg_iovlen = 0;
	msg.msg_flags = MSG_MORE;

	while (i--) {
		sendmsg(opfd, &msg, 0);
		vmsplice(pipes[1], &iov, 1, SPLICE_F_GIFT);
		splice(pipes[0], NULL, opfd, NULL, 8192, 0);
		read(opfd, buf, 8192);
	}

	close(opfd);
	close(tfmfd);
	close(pipes[0]);
	close(pipes[1]);
}

int main(int argc, char **argv)
{
	int i;

	const char key[16] =
		"\x06\xa9\x21\x40\x36\xb8\xa1\x5b"
		"\x51\x2e\x03\xd5\x34\x12\x00\x06";
	char iv[16] =
		"\x3d\xaf\xba\x42\x9d\x9e\xb4\x30"
		"\xb4\x22\xda\x80\x2c\x9f\xac\x41";

	memcpy(buf, "Single block msg", 16);

	/* Any argument selects the OpenSSL path; none selects the kernel. */
	if (argc > 1)
		crypt_ssl(key, iv, 1024 * 1024);
	else
		crypt_kernel(key, iv, 1024 * 1024);

	for (i = 0; i < 8192; i++) {
		printf("%02x", (unsigned char)buf[i]);
	}
	printf("\n");

	return 0;
}