Re: comparison of the AF_ALG interface with the /dev/crypto
On 09/01/2011 04:15 AM, Herbert Xu wrote:
> Nikos Mavrogiannopoulos n...@gnutls.org wrote:
>> Given my benchmarks have no issues, it is not apparent to me why one
>> should use AF_ALG instead of cryptodev. I do not know, though, why
>> AF_ALG performs so poorly. I'd speculate that the cause is the use of
>> the socket API and the number of system calls required.
>
> The target usage of AF_ALG is hardware offload devices that cannot be
> directly used in user-space, not software crypto on implementations
> such as AESNI/Padlock.
>
> Going through the kernel to use something like AESNI/Padlock or
> software crypto is insane.
>
> Given the intended target case, your numbers are pretty much
> meaningless as cryptodev's performance can be easily beaten by a pure
> user-space implementation.

Actually, this is the reason for the ecb(cipher-null) comparison: to
emulate the case of a hardware offload device. I tried to make that
clear in the text, but maybe it is not. If you look, AF_ALG performs
really badly in that case. It performs better when a software or a
Padlock implementation of AES is involved (which, as you say, is a
useless use case).

Of course, I don't own such an offloading device and cannot test it
directly. If you have different values from a benchmark with an actual
hardware accelerator, I'll be happy to include them.

regards,
Nikos
--
To unsubscribe from this list: send the line "unsubscribe linux-crypto" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: comparison of the AF_ALG interface with the /dev/crypto
On Thu, Sep 01, 2011 at 08:26:07AM +0200, Nikos Mavrogiannopoulos wrote:
> Actually, this is the reason for the ecb(cipher-null) comparison: to
> emulate the case of a hardware offload device. I tried to make that
> clear in the text, but maybe it is not. If you look, AF_ALG performs
> really badly in that case. It performs better when a software or a
> Padlock implementation of AES is involved (which, as you say, is a
> useless use case).

It's meaningless because such devices operate at a rate much lower
than the figures you give.

Cheers,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: comparison of the AF_ALG interface with the /dev/crypto
On 09/01/2011 08:43 AM, Herbert Xu wrote:
> On Thu, Sep 01, 2011 at 08:26:07AM +0200, Nikos Mavrogiannopoulos wrote:
>> Actually, this is the reason for the ecb(cipher-null) comparison: to
>> emulate the case of a hardware offload device. I tried to make that
>> clear in the text, but maybe it is not. If you look, AF_ALG performs
>> really badly in that case. It performs better when a software or a
>> Padlock implementation of AES is involved (which, as you say, is a
>> useless use case).
>
> It's meaningless because such devices operate at a rate much lower
> than the figures you give.

Have you actually measured that?
Re: comparison of the AF_ALG interface with the /dev/crypto
On Thu, Sep 01, 2011 at 08:54:19AM +0200, Nikos Mavrogiannopoulos wrote:
> Have you actually measured that?

Not against your cryptodev code-base.

Cheers,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: comparison of the AF_ALG interface with the /dev/crypto
On Thu, Sep 1, 2011 at 4:14 PM, Herbert Xu herb...@gondor.hengli.com.au wrote:
> Are you maxing out your submission CPU? If not then you're testing
> the latency of the interface, as opposed to the throughput.

I think it is obvious that a benchmark of throughput measures
throughput. If, however, you think that AF_ALG is at a disadvantage in
this benchmark because it is a high-latency interface, you're free to
propose and perform another one.

I haven't seen anywhere how this interface was supposed to be used, nor
anything about its qualities (high latency, maybe(?) high throughput or
so). Thus, I designed this benchmark with a use case in mind, i.e., a
TLS or DTLS tunnel executing in a system with such an accelerator.
There might be other benchmarks with other use cases in mind, but I
haven't seen any.

regards,
Nikos
Re: comparison of the AF_ALG interface with the /dev/crypto
On Thu, Sep 1, 2011 at 4:59 PM, Herbert Xu herb...@gondor.hengli.com.au wrote:
>> latency, maybe(?) high throughput or so). Thus, I designed this
>> benchmark with a use case in mind, i.e., a TLS or DTLS tunnel
>> executing in a system with such an accelerator. There might be other
>> benchmarks with other use cases in mind, but I haven't seen any.
>
> Putting TLS data-path in user-space is always going to be less than
> optimal, especially with hardware crypto offload, since you'll be
> crossing the user-space/kernel boundary multiple times.

Indeed, but today that is what we have in some systems: user-space TLS
implementations (GnuTLS and OpenSSL) and kernel-space crypto
offloading. The purpose of the /dev/crypto and AF_ALG interfaces is to
connect those together. It would be interesting to have a partial
kernel-space TLS implementation, but I don't know whether such a thing
could ever make it into the kernel.

regards,
Nikos
Re: comparison of the AF_ALG interface with the /dev/crypto
On Thu, Sep 01, 2011 at 05:06:06PM +0200, Nikos Mavrogiannopoulos wrote:
> Indeed, but today that is what we have in some systems: user-space TLS
> implementations (GnuTLS and OpenSSL) and kernel-space crypto
> offloading. The purpose of the /dev/crypto and AF_ALG interfaces is to
> connect those together. It would be interesting to have a partial
> kernel-space TLS implementation, but I don't know whether such a thing
> could ever make it into the kernel.

Well we've talked about a kernel implementation of the data path
previously and I don't think there is any opposition to the idea.
The only thing missing is an implementation.

Cheers,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: comparison of the AF_ALG interface with the /dev/crypto
Herbert,

On Thu, Sep 01, 2011 at 10:14:45PM +0800, Herbert Xu wrote:
> Phil Sutter p...@nwl.cc wrote:
>> chunksize    af_alg        cryptodev     (100 * cryptodev / af_alg)
>> --------------------------------------------------------------------
>>       512     4.169 MB/s    7.113 MB/s   171 %
>>      1024     7.904 MB/s   12.957 MB/s   164 %
>>      2048    13.163 MB/s   19.683 MB/s   150 %
>>      4096    20.218 MB/s   26.960 MB/s   133 %
>>      8192    27.539 MB/s   34.373 MB/s   125 %
>>     16384    33.730 MB/s   39.997 MB/s   119 %
>>     32768    37.399 MB/s   42.727 MB/s   114 %
>>     65536    40.004 MB/s   44.660 MB/s   112 %
>
> Are you maxing out your submission CPU? If not then you're testing
> the latency of the interface, as opposed to the throughput.

Good point. So in order to also test the throughput, I've put my OpenRD
under load:

| stress -c 2 -i 2 -m 2 --vm-bytes 64MB

and ran the tests again:

chunksize    af_alg        cryptodev     (100 * cryptodev / af_alg)
--------------------------------------------------------------------
      512     0.618 MB/s    1.14 MB/s    184 %
     1024     1.258 MB/s    2.28 MB/s    181 %
     2048     2.453 MB/s    4.39 MB/s    179 %
     4096     4.540 MB/s    7.76 MB/s    171 %
     8192     7.981 MB/s   11.67 MB/s    146 %
    16384    12.543 MB/s   14.08 MB/s    112 %
    32768    13.139 MB/s   14.46 MB/s    110 %
    65536    14.254 MB/s   15.55 MB/s    109 %

So that means cryptodev-linux is superior in throughput as well as
latency, right? Or is it the lower latency of the interface causing the
higher throughput?

Greetings, Phil
Re: comparison of the AF_ALG interface with the /dev/crypto
On Thu, Sep 01, 2011 at 05:09:28PM +0200, Phil Sutter wrote:
> Good point. So in order to also test the throughput, I've put my OpenRD
> under load:

No that's not what I meant. You're pushing a request to an async
device and waiting for a response to come back before pushing the
next request. In order to maximise throughput, you need to issue
your requests without waiting for the responses synchronously.

Cheers,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
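The difference between a latency-bound and a throughput-bound measurement can be put into numbers with a back-of-the-envelope model. The sketch below is only an illustration under stated assumptions: the function name model_throughput_mbps() and the model itself (a fixed round-trip time per request, and a device that handles up to `depth` requests concurrently without slowing down) are hypothetical and not taken from the thread or from any measurement.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model (an assumption for illustration, not measured data):
 * every request takes latency_us microseconds from submission to
 * completion, and up to `depth` requests are in flight at the same time.
 * With depth == 1 the caller waits for each response before sending the
 * next request, so the result is purely latency-bound.
 */
double model_throughput_mbps(size_t chunk_bytes, double latency_us,
			     unsigned depth)
{
	assert(latency_us > 0.0 && depth > 0);

	/* Bytes per microsecond equals MB/s (with 1 MB = 10^6 bytes). */
	return (double)depth * (double)chunk_bytes / latency_us;
}
```

With 4096-byte chunks and an assumed 100 microsecond round trip, the model gives 40.96 MB/s when requests are issued one at a time and 327.68 MB/s with eight requests in flight. A real device saturates at some point, so the linear scaling only holds while the hardware still has spare capacity.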
Re: comparison of the AF_ALG interface with the /dev/crypto
From: Nikos Mavrogiannopoulos n...@gnutls.org
Date: Thu, 1 Sep 2011 17:06:06 +0200

> It would be interesting to have a partial kernel-space TLS
> implementation but I don't know whether such a thing could ever make
> it to kernel.

Herbert and I have discussed this several times and we plan on
implementing this at some point.
Re: comparison of the AF_ALG interface with the /dev/crypto
On 09/01/2011 05:32 PM, David Miller wrote:
> From: Nikos Mavrogiannopoulos n...@gnutls.org
> Date: Thu, 1 Sep 2011 17:06:06 +0200
>
>> It would be interesting to have a partial kernel-space TLS
>> implementation but I don't know whether such a thing could ever make
>> it to kernel.
>
> Herbert and I have discussed this several times and we plan on
> implementing this at some point.

The problem is that TLS is not a universal thing. There are still SSH,
Kerberos, OpenVPN (as far as I remember, it uses a custom protocol),
etc. It makes sense to have something that applies broadly, especially
when it is in the Linux kernel. Currently, having a device such as
/dev/crypto looks like a good compromise.

regards,
Nikos
Re: comparison of the AF_ALG interface with the /dev/crypto
Nikos Mavrogiannopoulos n...@gnutls.org wrote:
> Given my benchmarks have no issues, it is not apparent to me why one
> should use AF_ALG instead of cryptodev. I do not know, though, why
> AF_ALG performs so poorly. I'd speculate that the cause is the use of
> the socket API and the number of system calls required.

The target usage of AF_ALG is hardware offload devices that cannot
be directly used in user-space, not software crypto on implementations
such as AESNI/Padlock.

Going through the kernel to use something like AESNI/Padlock or
software crypto is insane.

Given the intended target case, your numbers are pretty much
meaningless as cryptodev's performance can be easily beaten by
a pure user-space implementation.

Cheers,
-- 
Email: Herbert Xu herb...@gondor.apana.org.au
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
Re: comparison of the AF_ALG interface with the /dev/crypto
On 08/28/2011 10:35 PM, David Miller wrote:
>> The benchmark idea was to test the speed of initialization,
>> encryption and deinitialization, as well as the encryption speed
>> alone. These are the most common use cases of the frameworks (i.e.
>> how they would be used by a cryptographic library).
>
> Be sure to use splice() with AF_ALG for maximum performance.
> For example, see the test program below. You'll need to replace
> 8192 with whatever the page size is on your cpu.

As I understand it, with splice you can encrypt only page-aligned data
that spans a multiple of pages. This is a very uncommon case. My
benchmark targets the generic case, i.e., the way this interface will
be used in crypto libraries like gnutls. However, I'll update the
comparison page to include the splice version as well.

regards,
Nikos
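The generic case mentioned here, where a library hands over an arbitrary buffer, corresponds to feeding AF_ALG with plain sendmsg() and read() instead of splice(). The following is a minimal sketch under stated assumptions: it relies on the definitions in <linux/if_alg.h> being available, the helper name alg_cbc_aes128_encrypt() is hypothetical, and error handling is reduced to returning -1 (including when the kernel lacks AF_ALG or cbc(aes)).

```c
#include <assert.h>
#include <linux/if_alg.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

#ifndef AF_ALG
#define AF_ALG 38
#endif
#ifndef SOL_ALG
#define SOL_ALG 279
#endif

/*
 * Encrypt len bytes (a non-zero multiple of 16) with cbc(aes), 128-bit key,
 * through AF_ALG without splice(). `in` needs no particular alignment.
 * Hypothetical helper: returns 0 on success, -1 on any failure.
 */
int alg_cbc_aes128_encrypt(const unsigned char key[16],
			   const unsigned char iv[16],
			   const void *in, void *out, size_t len)
{
	struct sockaddr_alg sa = {
		.salg_family = AF_ALG,
		.salg_type = "skcipher",
		.salg_name = "cbc(aes)",
	};
	union {	/* the union keeps the control buffer aligned for cmsghdr */
		char buf[CMSG_SPACE(sizeof(__u32)) +
			 CMSG_SPACE(sizeof(struct af_alg_iv) + 16)];
		struct cmsghdr align;
	} ctl;
	struct msghdr msg;
	struct cmsghdr *cmsg;
	struct af_alg_iv *ivp;
	struct iovec iov;
	int tfmfd, opfd = -1, ret = -1;

	if (len == 0 || len % 16 != 0)
		return -1;

	tfmfd = socket(AF_ALG, SOCK_SEQPACKET, 0);
	if (tfmfd < 0)
		return -1;
	if (bind(tfmfd, (struct sockaddr *)&sa, sizeof(sa)) < 0)
		goto out;
	if (setsockopt(tfmfd, SOL_ALG, ALG_SET_KEY, key, 16) < 0)
		goto out;
	opfd = accept(tfmfd, NULL, 0);
	if (opfd < 0)
		goto out;

	/* Zero the buffers first so CMSG_NXTHDR() sees sane lengths. */
	memset(&ctl, 0, sizeof(ctl));
	memset(&msg, 0, sizeof(msg));
	msg.msg_control = ctl.buf;
	msg.msg_controllen = sizeof(ctl.buf);

	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type = ALG_SET_OP;
	cmsg->cmsg_len = CMSG_LEN(sizeof(__u32));
	*(__u32 *)CMSG_DATA(cmsg) = ALG_OP_ENCRYPT;

	cmsg = CMSG_NXTHDR(&msg, cmsg);
	assert(cmsg != NULL);	/* the buffer was sized for two headers */
	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type = ALG_SET_IV;
	cmsg->cmsg_len = CMSG_LEN(sizeof(struct af_alg_iv) + 16);
	ivp = (struct af_alg_iv *)CMSG_DATA(cmsg);
	ivp->ivlen = 16;
	memcpy(ivp->iv, iv, 16);

	/* The payload travels in the same message as the control data. */
	iov.iov_base = (void *)in;
	iov.iov_len = len;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	if (sendmsg(opfd, &msg, 0) != (ssize_t)len)
		goto out;
	if (read(opfd, out, len) != (ssize_t)len)
		goto out;
	ret = 0;
out:
	if (opfd >= 0)
		close(opfd);
	close(tfmfd);
	return ret;
}
```

A caller would pass the 16-byte key and IV plus input and output buffers of equal length, for example alg_cbc_aes128_encrypt(key, iv, plaintext, ciphertext, 16). Compared with the splice() variant, this path copies the data into the kernel, which is the trade-off under discussion in this part of the thread.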
Re: comparison of the AF_ALG interface with the /dev/crypto
From: Nikos Mavrogiannopoulos n...@gnutls.org
Date: Mon, 29 Aug 2011 09:32:19 +0200

> On 08/28/2011 10:35 PM, David Miller wrote:
>>> The benchmark idea was to test the speed of initialization,
>>> encryption and deinitialization, as well as the encryption speed
>>> alone. These are the most common use cases of the frameworks (i.e.
>>> how they would be used by a cryptographic library).
>>
>> Be sure to use splice() with AF_ALG for maximum performance.
>> For example, see the test program below. You'll need to replace
>> 8192 with whatever the page size is on your cpu.
>
> As I understand it, with splice you can encrypt only page-aligned data
> that spans a multiple of pages. This is a very uncommon case. My
> benchmark targets the generic case, i.e., the way this interface will
> be used in crypto libraries like gnutls.

Only the buffer you use must have these properties, you can use
whatever lengths you like.
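One way to obtain a buffer with these properties without hard-coding 8192 is to ask the system for its page size at run time. This is a minimal sketch, assuming a POSIX system; alloc_page_buffer() is a hypothetical helper name and not part of AF_ALG or any library.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Allocate a zeroed buffer that starts on a page boundary and covers a
 * whole number of pages, large enough for len bytes of payload.
 * Hypothetical helper: returns NULL on failure; release with free().
 * The usable size is stored in *capacity when capacity is not NULL.
 */
void *alloc_page_buffer(size_t len, size_t *capacity)
{
	long page = sysconf(_SC_PAGESIZE);
	void *p = NULL;
	size_t cap;

	if (page <= 0 || len == 0 || len > SIZE_MAX - (size_t)page)
		return NULL;

	/* Round the capacity up to a multiple of the page size. */
	cap = (len + (size_t)page - 1) / (size_t)page * (size_t)page;

	if (posix_memalign(&p, (size_t)page, cap) != 0)
		return NULL;

	/* posix_memalign() guarantees the requested alignment. */
	assert((uintptr_t)p % (uintptr_t)page == 0);

	memset(p, 0, cap);
	if (capacity)
		*capacity = cap;
	return p;
}
```

The payload copied into such a buffer can have any length up to the returned capacity; only the buffer itself is page-aligned and page-sized, which matches the distinction drawn in the reply above.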
Re: comparison of the AF_ALG interface with the /dev/crypto
From: Nikos Mavrogiannopoulos n...@gnutls.org
Date: Sun, 28 Aug 2011 15:17:00 +0200

> The benchmark idea was to test the speed of initialization,
> encryption and deinitialization, as well as the encryption speed
> alone. These are the most common use cases of the frameworks (i.e.
> how they would be used by a cryptographic library).

Be sure to use splice() with AF_ALG for maximum performance.
For example, see the test program below. You'll need to replace
8192 with whatever the page size is on your cpu.

#define _GNU_SOURCE		/* for splice() and vmsplice() */
#include <fcntl.h>
#include <openssl/aes.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>
#include <linux/types.h>

/* Newer C libraries already provide these, so only define them if missing. */
#ifndef AF_ALG
#define AF_ALG 38
#endif
#ifndef SOL_ALG
#define SOL_ALG 279
#endif

#ifndef SPLICE_F_GIFT
#define SPLICE_F_GIFT	(0x08)	/* pages passed in are a gift */
#endif

struct sockaddr_alg {
	__u16	salg_family;
	__u8	salg_type[14];
	__u32	salg_feat;
	__u32	salg_mask;
	__u8	salg_name[64];
};

struct af_alg_iv {
	__u32	ivlen;
	__u8	iv[0];
};

/* Socket options */
#define ALG_SET_KEY			1
#define ALG_SET_IV			2
#define ALG_SET_OP			3

/* Operations */
#define ALG_OP_DECRYPT			0
#define ALG_OP_ENCRYPT			1

static char buf[8192] __attribute__((__aligned__(8192)));

/* Reference path: encrypt buf in place i times with OpenSSL in user-space. */
static void crypt_ssl(const char *key, char *iv, int i)
{
	AES_KEY akey;

	AES_set_encrypt_key((const unsigned char *)key, 128, &akey);

	while (i--)
		AES_cbc_encrypt((unsigned char *)buf, (unsigned char *)buf,
				8192, &akey, (unsigned char *)iv, 1);
}

/* Kernel path: encrypt buf in place i times through AF_ALG and splice(). */
static void crypt_kernel(const char *key, char *oiv, int i)
{
	int opfd;
	int tfmfd;
	struct sockaddr_alg sa = {
		.salg_family = AF_ALG,
		.salg_type = "skcipher",
		.salg_name = "cbc(aes)"
	};
	struct msghdr msg = {};
	struct cmsghdr *cmsg;
	char cbuf[CMSG_SPACE(4) + CMSG_SPACE(20)] = {};
	struct aes_iv {
		__u32 len;
		__u8 iv[16];
	} *iv;
	struct iovec iov;
	int pipes[2];

	pipe(pipes);

	tfmfd = socket(AF_ALG, SOCK_SEQPACKET, 0);

	bind(tfmfd, (struct sockaddr *)&sa, sizeof(sa));

	setsockopt(tfmfd, SOL_ALG, ALG_SET_KEY, key, 16);

	opfd = accept(tfmfd, NULL, 0);

	msg.msg_control = cbuf;
	msg.msg_controllen = sizeof(cbuf);

	/* First control message: select the operation. */
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type = ALG_SET_OP;
	cmsg->cmsg_len = CMSG_LEN(4);
	*(__u32 *)CMSG_DATA(cmsg) = ALG_OP_ENCRYPT;

	/* Second control message: pass the IV. */
	cmsg = CMSG_NXTHDR(&msg, cmsg);
	cmsg->cmsg_level = SOL_ALG;
	cmsg->cmsg_type = ALG_SET_IV;
	cmsg->cmsg_len = CMSG_LEN(20);
	iv = (void *)CMSG_DATA(cmsg);
	iv->len = 16;
	memcpy(iv->iv, oiv, 16);

	iov.iov_base = buf;
	iov.iov_len = 8192;

	msg.msg_iovlen = 0;
	msg.msg_flags = MSG_MORE;

	while (i--) {
		/* Control data goes via sendmsg(), the payload via the pipe. */
		sendmsg(opfd, &msg, 0);
		vmsplice(pipes[1], &iov, 1, SPLICE_F_GIFT);
		splice(pipes[0], NULL, opfd, NULL, 8192, 0);
		read(opfd, buf, 8192);
	}

	close(opfd);
	close(tfmfd);
	close(pipes[0]);
	close(pipes[1]);
}

int main(int argc, char **argv)
{
	int i;

	const char key[16] =
		"\x06\xa9\x21\x40\x36\xb8\xa1\x5b"
		"\x51\x2e\x03\xd5\x34\x12\x00\x06";
	char iv[16] =
		"\x3d\xaf\xba\x42\x9d\x9e\xb4\x30"
		"\xb4\x22\xda\x80\x2c\x9f\xac\x41";

	memcpy(buf, "Single block msg", 16);

	/* Any command-line argument selects the OpenSSL path. */
	if (argc > 1)
		crypt_ssl(key, iv, 1024 * 1024);
	else
		crypt_kernel(key, iv, 1024 * 1024);

	for (i = 0; i < 8192; i++) {
		printf("%02x", (unsigned char)buf[i]);
	}
	printf("\n");

	return 0;
}