Re: patch to improve AES-NI performance
On Sun, Aug 25, 2013 at 10:19 AM, Ollivier Robert robe...@keltia.freenix.fr wrote:

According to Ollivier Robert: You are right, I wanted to say r226837 which is the code one.

FYI I've finally merged r226837,r226839 as r254856 in stable/9 as it is a prerequisite to apply jmg's patch. I've asked re@ whether they would consider this for 9.2. It is very late in the 9.2 release cycle but that patch has been in 10 for more than a year now...

So this patch can now be applied to STABLE/9 for testing on a local system, since these 3 revisions are now in?

-- Ollivier ROBERT -=- FreeBSD: The Power to Serve! -=- robe...@keltia.freenix.fr In memoriam to Ondine : http://ondine.keltia.net/

___ freebsd-current@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-current To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org
Re: patch to improve AES-NI performance
Outback Dingo wrote this message on Tue, Aug 27, 2013 at 19:01 -0400:

On Sun, Aug 25, 2013 at 10:19 AM, Ollivier Robert robe...@keltia.freenix.fr wrote:

According to Ollivier Robert: You are right, I wanted to say r226837 which is the code one.

FYI I've finally merged r226837,r226839 as r254856 in stable/9 as it is a prerequisite to apply jmg's patch. I've asked re@ whether they would consider this for 9.2. It is very late in the 9.2 release cycle but that patch has been in 10 for more than a year now...

So this patch can now be applied to STABLE/9 for testing on a local system, since these 3 revisions are now in?

I've compile-tested the patch, but not run-tested it: https://people.freebsd.org/~jmg/aesni.9stable.patch

Note that 9stable uses gcc by default and this patch requires clang, so make sure you set WITH_CLANG=YES WITH_CLANG_IS_CC=YES, otherwise it won't compile...

This patch only required minor changes from my original patch to apply.

-- John-Mark Gurney Voice: +1 415 225 5579 All that I will do, has been done, All that I have, has not.
Re: patch to improve AES-NI performance
According to Ollivier Robert: You are right, I wanted to say r226837 which is the code one.

FYI I've finally merged r226837,r226839 as r254856 in stable/9 as it is a prerequisite to apply jmg's patch. I've asked re@ whether they would consider this for 9.2. It is very late in the 9.2 release cycle but that patch has been in 10 for more than a year now...

-- Ollivier ROBERT -=- FreeBSD: The Power to Serve! -=- robe...@keltia.freenix.fr In memoriam to Ondine : http://ondine.keltia.net/
Re: patch to improve AES-NI performance
According to John-Mark Gurney on Thu, Aug 22, 2013 at 01:20:27PM -0700:

I have developed a patch to improve AES-NI performance. If you took the AES-XTS algorithm into userland (no cryptodev or geli usage), these changes improve the performance over 10x in my tests (from ~150MB/sec to over 2GB/sec). In tests of geli on gnop, the performance improvement is more moderate, around 4x due to overhead in other parts of the system.

Thanks a lot for this patch. Now, looking at it in the stable/9 context, I can see that pjd did not merge (as he said at the time of commit) r226839 r226839. Is there any objection to merging these two (and possibly 247061 as well -- copyright update)?

I ask for two reasons: these two revisions speed up AES-NI quite a bit, and they are required for using jmg's patch. I'll be testing all this in the next few days on my new AES-NI enabled machine.

-- Ollivier ROBERT -=- FreeBSD: The Power to Serve! -=- robe...@keltia.net In memoriam to Ondine, our 2nd child: http://ondine.keltia.net/
Re: patch to improve AES-NI performance
On 8/23/2013 11:16 AM, Ollivier Robert wrote:

According to John-Mark Gurney on Thu, Aug 22, 2013 at 01:20:27PM -0700:

I have developed a patch to improve AES-NI performance. If you took the AES-XTS algorithm into userland (no cryptodev or geli usage), these changes improve the performance over 10x in my tests (from ~150MB/sec to over 2GB/sec). In tests of geli on gnop, the performance improvement is more moderate, around 4x due to overhead in other parts of the system.

Thanks a lot for this patch. Now, looking at it in the stable/9 context, I can see that pjd did not merge (as he said at the time of commit) r226839 r226839. Is there any objection to merging these two (and possibly 247061 as well -- copyright update)?

I ask for two reasons: these two revisions speed up AES-NI quite a bit, and they are required for using jmg's patch. I'll be testing all this in the next few days on my new AES-NI enabled machine.

Speeding up userland AES is very interesting to me for a couple of apps. If there is a proper way I should test on RELENG_9, please let me know, as I have a few boxes that I would be happy to test/deploy on.

---Mike

-- --- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, m...@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/
Re: patch to improve AES-NI performance
Ollivier Robert wrote this message on Fri, Aug 23, 2013 at 17:16 +0200:

According to John-Mark Gurney on Thu, Aug 22, 2013 at 01:20:27PM -0700:

I have developed a patch to improve AES-NI performance. If you took the AES-XTS algorithm into userland (no cryptodev or geli usage), these changes improve the performance over 10x in my tests (from ~150MB/sec to over 2GB/sec). In tests of geli on gnop, the performance improvement is more moderate, around 4x due to overhead in other parts of the system.

Thanks a lot for this patch. Now, looking at it in the stable/9 context, I can see that pjd did not merge (as he said at the time of commit) r226839 r226839. Is there any objection to merging these two (and possibly 247061 as well -- copyright update)?

You repeated r226839 twice. What is the correct second revision? And both the ones above are just copyright updates, no functionality changes...

I ask for two reasons: these two revisions speed up AES-NI quite a bit, and they are required for using jmg's patch. I'll be testing all this in the next few days on my new AES-NI enabled machine.

-- John-Mark Gurney Voice: +1 415 225 5579 All that I will do, has been done, All that I have, has not.
Re: patch to improve AES-NI performance
Mike Tancsa wrote this message on Fri, Aug 23, 2013 at 11:26 -0400:

On 8/23/2013 11:16 AM, Ollivier Robert wrote:

According to John-Mark Gurney on Thu, Aug 22, 2013 at 01:20:27PM -0700:

I have developed a patch to improve AES-NI performance. If you took the AES-XTS algorithm into userland (no cryptodev or geli usage), these changes improve the performance over 10x in my tests (from ~150MB/sec to over 2GB/sec). In tests of geli on gnop, the performance improvement is more moderate, around 4x due to overhead in other parts of the system.

Thanks a lot for this patch. Now, looking at it in the stable/9 context, I can see that pjd did not merge (as he said at the time of commit) r226839 r226839. Is there any objection to merging these two (and possibly 247061 as well -- copyright update)?

I ask for two reasons: these two revisions speed up AES-NI quite a bit, and they are required for using jmg's patch. I'll be testing all this in the next few days on my new AES-NI enabled machine.

Speeding up userland AES is very interesting to me for a couple of apps. If there is a proper way I should test on RELENG_9, please let me know, as I have a few boxes that I would be happy to test/deploy on.

My patch would only affect userland applications that use /dev/crypto... If they do their own AES-NI work, then there isn't any improvement...

-- John-Mark Gurney Voice: +1 415 225 5579 All that I will do, has been done, All that I have, has not.
Re: patch to improve AES-NI performance
On 8/23/2013 2:05 PM, John-Mark Gurney wrote:

Speeding up userland AES is very interesting to me for a couple of apps. If there is a proper way I should test on RELENG_9, please let me know, as I have a few boxes that I would be happy to test/deploy on.

My patch would only affect userland applications that use /dev/crypto... If they do their own AES-NI work, then there isn't any improvement...

For me it's ssh, which I think does, no?

---Mike

-- --- Mike Tancsa, tel +1 519 651 3400 Sentex Communications, m...@sentex.net Providing Internet services since 1994 www.sentex.net Cambridge, Ontario Canada http://www.tancsa.com/
Re: patch to improve AES-NI performance
According to John-Mark Gurney:

pjd did not merge (as he said at the time of commit) r226839 r226839. Is there any objection to merge these two (and possibly 247061 as well -- copyright update)?

You repeated r226839 twice. What is the correct second revision?

You are right, I wanted to say r226837 which is the code one.

-- Ollivier ROBERT -=- FreeBSD: The Power to Serve! -=- robe...@keltia.freenix.fr In memoriam to Ondine : http://ondine.keltia.net/
Re: patch to improve AES-NI performance
Mike Tancsa wrote this message on Fri, Aug 23, 2013 at 14:19 -0400:

On 8/23/2013 2:05 PM, John-Mark Gurney wrote:

Speeding up userland AES is very interesting to me for a couple of apps. If there is a proper way I should test on RELENG_9, please let me know, as I have a few boxes that I would be happy to test/deploy on.

My patch would only affect userland applications that use /dev/crypto... If they do their own AES-NI work, then there isn't any improvement...

For me it's ssh, which I think does, no?

It looks like it uses OpenSSL for its crypto, not /dev/crypto... Also, my work was done improving AES-XTS, which isn't used by OpenSSH... OpenSSH looks like it uses either AES-GCM or AES-CTR, neither of which is supported by /dev/crypto...

My gcc patch does include PCLMULQDQ support, which will be helpful for improving the performance of AES-GCM, and it looks like OpenSSL 1.0.1 has support, which is in HEAD, not RELENG_9 yet...

So, if you want better ssh performance, install OpenSSL 1.0.1 and compile OpenSSH against it...

-- John-Mark Gurney Voice: +1 415 225 5579 All that I will do, has been done, All that I have, has not.
Re: patch to improve AES-NI performance
John-Mark Gurney j...@funkthat.com writes:
Mike Tancsa m...@sentex.net writes:
John-Mark Gurney j...@funkthat.com writes:

My patch would only affect userland applications that use /dev/crypto...

For me it's ssh, which I think does, no?

It looks like it uses OpenSSL for its crypto, not /dev/crypto...

It uses OpenSSL engines, which use /dev/crypto. This is why we had to turn off sandbox mode - a CRIOGET ioctl fails because the sandbox code sets RLIMIT_NOFILE to 0.

(trimming security@ from the cc: list as it's an alias for secteam@ which is not the appropriate venue for this discussion.)

DES -- Dag-Erling Smørgrav - d...@des.no
Re: patch to improve AES-NI performance
Dag-Erling Smørgrav wrote this message on Fri, Aug 23, 2013 at 21:30 +0200:

John-Mark Gurney j...@funkthat.com writes:
Mike Tancsa m...@sentex.net writes:
John-Mark Gurney j...@funkthat.com writes:

My patch would only affect userland applications that use /dev/crypto...

For me it's ssh, which I think does, no?

It looks like it uses OpenSSL for its crypto, not /dev/crypto...

It uses OpenSSL engines, which use /dev/crypto. This is why we had to turn off sandbox mode - a CRIOGET ioctl fails because the sandbox code sets RLIMIT_NOFILE to 0.

If it does use /dev/crypto via OpenSSL (see below), OpenSSL only uses CBC mode, which means that only decryption will see any significant performance improvement w/ my changes... Also, how many people know to load the kernel modules cryptodev and aesni to use it? OpenSSL should use AES-NI natively instead of using the cryptodev engine since it'll eliminate kernel calls, etc...

Hmm... looking at the source for openssl speed, it looks like it only measures CBC encryption speed, not decryption speed, so you won't be able to see the performance difference between encryption and decryption for CBC mode...

I'm not sure that even if you set cryptodev engine on -HEAD that it actually uses it... I just did a:

ktrace openssl speed -engine cryptodev aes-128-cbc

But this is part of the trace:

4377 openssl CALL ioctl(0x4,CIOCGSESSION,0x7fff9ac0)
4377 openssl RET ioctl -1 errno 22 Invalid argument
4377 openssl CALL close(0x4)
4377 openssl RET close 0
4377 openssl CALL write(0x2,0x7fff9250,0x18)
4377 openssl GIO fd 2 wrote 24 bytes
       engine cryptodev set.
4377 openssl RET write 24/0x18
4377 openssl CALL sigaction(SIGALRM,0x7fff9b60,0x7fff9b40)
4377 openssl RET sigaction 0
4377 openssl CALL write(0x2,0x7fff9270,0x2c)
4377 openssl GIO fd 2 wrote 44 bytes
       Doing aes-128 cbc for 3s on 16 size blocks:
4377 openssl RET write 44/0x2c
4377 openssl CALL setitimer(0,0x7fff9b70,0x7fff9b50)
4377 openssl RET setitimer 0
4377 openssl CALL getrusage(0,0x7fff9ab0)
4377 openssl RET getrusage 0
4377 openssl CALL getrusage(0x,0x7fff9ab0)
4377 openssl RET getrusage 0
4377 openssl PSIG SIGALRM caught handler=0x451550 mask=0x0 code=SI_KERNEL
4377 openssl CALL sigaction(SIGALRM,0x7fff9520,0x7fff9500)
4377 openssl RET sigaction 0
4377 openssl CALL sigreturn(0x7fff9580)
4377 openssl RET sigreturn JUSTRETURN
4377 openssl CALL getrusage(0,0x7fff9ab0)
4377 openssl RET getrusage 0
4377 openssl CALL getrusage(0x,0x7fff9ab0)
4377 openssl RET getrusage 0
4377 openssl CALL write(0x2,0x7fff9250,0x20)
4377 openssl GIO fd 2 wrote 32 bytes
       18712524 aes-128 cbc's in 2.99s

Shouldn't there be a whole host of ioctl's between the "Doing" print and the "x aes-128 cbc's" print if it was doing cryptodev? This does explain why the numbers w/ both engine cryptodev and w/o (kernel module unloaded, so not possible to use) were similar...

(trimming security@ from the cc: list as it's an alias for secteam@ which is not the appropriate venue for this discussion.)

DES -- Dag-Erling Smørgrav - d...@des.no

-- John-Mark Gurney Voice: +1 415 225 5579 All that I will do, has been done, All that I have, has not.
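The check described above (are there any ioctl calls between the "Doing" line and the result line?) can be automated on a saved kdump listing. The following is only a rough sketch in Python: the function name count_calls_between, the marker strings, and the SAMPLE excerpt are illustrative assumptions, and the record layout is assumed to match the "pid command CALL name(args)" form shown in the trace.

```python
import re

# Matches kdump records of the form "4377 openssl CALL ioctl(0x4,...)".
CALL_RE = re.compile(r"^\s*\d+\s+\S+\s+CALL\s+(\w+)\(")


def count_calls_between(lines, start_marker, end_marker, syscall="ioctl"):
    """Count CALL records for `syscall` that appear after the first line
    containing start_marker and before the next line containing end_marker.
    (Hypothetical helper, not part of ktrace/kdump.)"""
    counting = False
    count = 0
    for line in lines:
        if not counting:
            if start_marker in line:
                counting = True
            continue
        if end_marker in line:
            break
        match = CALL_RE.match(line)
        if match and match.group(1) == syscall:
            count += 1
    return count


# Abbreviated excerpt of the trace quoted in the message above.
SAMPLE = """\
4377 openssl CALL ioctl(0x4,CIOCGSESSION,0x7fff9ac0)
4377 openssl RET ioctl -1 errno 22 Invalid argument
4377 openssl GIO fd 2 wrote 44 bytes
       Doing aes-128 cbc for 3s on 16 size blocks:
4377 openssl CALL setitimer(0,0x7fff9b70,0x7fff9b50)
4377 openssl CALL getrusage(0,0x7fff9ab0)
4377 openssl CALL sigreturn(0x7fff9580)
4377 openssl GIO fd 2 wrote 32 bytes
       18712524 aes-128 cbc's in 2.99s
""".splitlines()

if __name__ == "__main__":
    n = count_calls_between(SAMPLE, "Doing aes-128 cbc", "aes-128 cbc's in")
    print(f"ioctl calls inside the timed loop: {n}")
```

A count of zero inside the timed region is consistent with the reading that the benchmark loop never went through /dev/crypto; a run that did would be expected to show many ioctl records there.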
patch to improve AES-NI performance
I have developed a patch to improve AES-NI performance. If you took the AES-XTS algorithm into userland (no cryptodev or geli usage), these changes improve the performance over 10x in my tests (from ~150MB/sec to over 2GB/sec). In tests of geli on gnop, the performance improvement is more moderate, around 4x due to overhead in other parts of the system.

This patch will be committed after the gcc intrinsics patch so that kernels will continue to compile w/ both clang and gcc w/o change.

I have tested both AES-XTS and AES-CBC mode of geli and verified no difference between this and software mode. I plan to commit the test scripts for this in the future too.

I have validated the AES-XTS via cryptodev against the standard test vectors and all the block sized vectors pass. The non-block sized test vectors cannot pass since our cryptodev implementation only allows block sized requests.

Thanks to Mike Hamburg for help and advice in making the AES-XTS algorithm go really fast.

The patch removes some assembly, and also replaces some hard coded instructions (as .byte values) with their proper instructions now that gcc can assemble them properly.

The patch: https://people.freebsd.org/~jmg/aesni.new1.patch

-- John-Mark Gurney Voice: +1 415 225 5579 All that I will do, has been done, All that I have, has not.
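For readers unfamiliar with XTS: besides the AES operation itself, every 16-byte block needs a fresh tweak, obtained from the previous one by multiplying by x in GF(2^128). The sketch below shows that per-block step in plain Python as defined by the XTS-AES standard. It is not taken from the patch, which works with compiler intrinsics in C; the function name xts_next_tweak is an illustrative assumption.

```python
def xts_next_tweak(tweak: bytes) -> bytes:
    """Return the tweak for the next 16-byte block in XTS mode.

    The tweak is treated as a little-endian 128-bit integer and multiplied
    by x in GF(2^128): shift left by one bit and, if a bit falls off the
    top, fold it back in by XORing 0x87 into the lowest byte.
    (Hypothetical helper for illustration only.)
    """
    if len(tweak) != 16:
        raise ValueError("XTS tweak must be exactly 16 bytes")
    value = int.from_bytes(tweak, "little") << 1
    if value >> 128:  # carry out of bit 127
        value = (value & ((1 << 128) - 1)) ^ 0x87
    return value.to_bytes(16, "little")


if __name__ == "__main__":
    t = bytes([0x01] + [0x00] * 15)
    for block_index in range(3):
        print(block_index, t.hex())
        t = xts_next_tweak(t)
```

Because each tweak depends only on the previous one and not on any ciphertext, the tweaks for several blocks can be prepared ahead of the AES rounds. Whether and how the patch exploits this is not stated in the message.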