[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
Additionally, I just noticed that LP: #1696863 is about an "ubuntu4.6" failure on Precise:

  InstallationMedia: Ubuntu 12.04.2 LTS "Precise Pangolin" - Release i386 (20130213)
  RelatedPackageVersions:
   dpkg 1.18.4ubuntu1.2
   apt 1.2.20
  SourcePackage: openssl
  Title: package libssl1.0.0:i386 1.0.2g-1ubuntu4.6 failed to install/upgrade:

This seems odd to me, since "ubuntu4.6" is for Xenial only and Precise has version "1.0.1-4ubuntu5.39".

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to openssl in Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs

Status in openssl package in Ubuntu:
  Fix Released
Status in openssl source package in Xenial:
  Fix Released
Status in openssl source package in Yakkety:
  Fix Released
Status in openssl source package in Zesty:
  Fix Released
Status in openssl source package in Artful:
  Fix Released

Bug description:
  [Impact]

  * Context:

  AMD added support in their processors for SHA Extensions[1] (CPU flag: sha_ni[2]) starting with the Ryzen[3] CPUs. Note that Ryzen CPUs come in 64-bit only (confirmed with an AMD representative).

  The current OpenSSL version on Ryzen still calls the SSSE3 routine for SHA; as a result, a number of extensions are effectively masked on Ryzen and show no improvement.

  [1] /proc/cpuinfo
  processor   : 0
  vendor_id   : AuthenticAMD
  cpu family  : 23
  model       : 1
  model name  : AMD Ryzen 5 1600 Six-Core Processor
  flags       : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

  [2] - sha_ni: SHA1/SHA256 Instruction Extensions

  [3] - https://en.wikipedia.org/wiki/Ryzen
  ...
  All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.[5]
  ...

  * Program to perform the CPUID check:

  Reference: https://software.intel.com/en-us/articles/intel-sha-extensions

  ... Availability of the Intel® SHA Extensions on a particular processor can be determined by checking the SHA CPUID bit in CPUID.(EAX=07H, ECX=0):EBX.SHA [bit 29]. The following C function, using inline assembly, performs the CPUID check:

  --
  int CheckForIntelShaExtensions() {
     int a, b, c, d;

     // Look for CPUID.7.0.EBX[29]
     // EAX = 7, ECX = 0
     a = 7;
     c = 0;

     asm volatile ("cpuid"
          :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
          :"a"(a), "c"(c)
         );

     // Intel® SHA Extensions feature bit is EBX[29]
     return ((b >> 29) & 1);
  }
  --

  On a CPU with sha_ni the program returns "1". Otherwise it returns "0".

  [Test Case]

  * Reproducible with the Xenial/Zesty/Artful releases.

  * Generate a checksum of a big file (e.g. a 5GB file) with openssl:

  $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
  SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

  real 0m12.835s
  user 0m12.344s
  sys  0m0.484s

  * Openssl speed

  $ openssl speed sha1
  Doing sha1 for 3s on 16 size blocks: 9969152 sha1's in 3.00s
  Doing sha1 for 3s on 64 size blocks: 8019164 sha1's in 3.00s
  Doing sha1 for 3s on 256 size blocks: 5254219 sha1's in 2.99s
  Doing sha1 for 3s on 1024 size blocks: 2217067 sha1's in 3.00s
  Doing sha1 for 3s on 8192 size blocks: 347842 sha1's in 3.00s
  OpenSSL 1.0.2g 1 Mar 2016
  built on: reproducible build, date unspecified
  options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
  compiler: gcc -I. -I.. -I../include -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -m64 -DL_ENDIAN -O3 -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
  The 'numbers' are in 1000s of bytes per second processed.
  type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
  sha1            53168.81k   171075.50k   449859.55k   756758.87k   949840.55k

  Performance is clearly better with the patch, which takes advantage of the SHA extension. (See the Regression Potential section for the results with the patch.)

  [Regression Potential]

  * Note: IRC discussion with infinity:
  https://bugs.launchpad.net/ubuntu/xenial/+source/openssl/+bug/1674399/comments/8

  * Note from IRC discussion with apw and rbasak:
  https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/comments/2

  * It basically allows openssl to take advantage of the SHA extension ...
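The two checks used in the description above (looking for the sha_ni flag and reading the "openssl speed" summary) can be sketched in C as follows. This is only an illustrative sketch, not part of the SRU or of OpenSSL: the function names has_cpu_flag and speed_kbytes_per_sec are hypothetical, the first assumes it is handed the value of the "flags" line from /proc/cpuinfo as a plain string, and the second assumes the "k" figures are simply operations x block size / elapsed seconds / 1000, which is consistent with the numbers quoted in the test case.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: return 1 if `flag` appears as a whole
 * whitespace-separated token in a /proc/cpuinfo "flags" value,
 * 0 otherwise. Matching whole tokens avoids false hits such as
 * finding "sse3" inside "ssse3". */
int has_cpu_flag(const char *flags, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return 0;

    while ((p = strstr(p, flag)) != NULL) {
        int starts_token = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        int ends_token = p[len] == '\0' || p[len] == ' ' ||
                         p[len] == '\t' || p[len] == '\n';
        if (starts_token && ends_token)
            return 1;
        p++; /* keep scanning after a partial match */
    }
    return 0;
}

/* Hypothetical helper: convert one "Doing sha1 for 3s on N size blocks"
 * sample into the figure shown in the summary table, i.e. thousands of
 * bytes processed per second. */
double speed_kbytes_per_sec(unsigned long ops, unsigned long block_bytes,
                            double seconds)
{
    return (double)ops * (double)block_bytes / seconds / 1000.0;
}
```

For example, has_cpu_flag(flags, "sha_ni") would be expected to return 1 for the Ryzen 5 1600 flags listed above, and speed_kbytes_per_sec(9969152, 16, 3.00) gives roughly the 53168.81k shown in the 16-byte column.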
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
Seth,

I don't think it is related ...

3 of them are related to package(s) prior to this SRU, on version "1.0.2g-1ubuntu4.6" released on "Mon, 30 Jan 2017", and only 1 is on "1.0.2g-1ubuntu4.8" for Xenial and another one on "libssl1.0.0" for Zesty.

Title: package openssl 1.0.2g-1ubuntu4.6 failed to install/upgrade
1697099

Title: package libssl-dev:amd64 1.0.2g-1ubuntu11.2 failed to install/upgrade
1692981

Title: package libssl1.0.0:i386 1.0.2g-1ubuntu4.8 failed to install/upgrade:
1696799

Title: package libssl1.0.0:i386 1.0.2g-1ubuntu4.6 failed to install/upgrade:
1696863
1696930

I also did a couple of upgrades from "ubuntu4.6" to "ubuntu4.8" inside amd64 and i386 lxc containers myself during the SRU and again today, and I haven't been able to reproduce the situation.

As per our IRC conversation, it is probably just a random pile of noise, but let's keep an eye on it over the next few days.

- Eric
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
Some of the bug reports in my previous comment are actually filed against previous versions of openssl (despite all being filed recently). Eric noted that only 1692981 (zesty) and 1696799 (xenial) are for this SRU, and 1696863 1697099 1696930 are for previous updates.

I don't think there's an issue with this SRU; this just appears to be usual upgrade failure noise with unfortunate timing making me think there may be a pattern here.

Thanks, and thanks to Eric for the extra research late on a Friday.
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
William, Eric, I'm curious if this OpenSSL update may have introduced issues:

https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1697099
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1696930
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1696863
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1696799
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1692981

My own upgrades on the command line went without trouble.

Thanks
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
This bug was fixed in the package openssl - 1.0.2g-1ubuntu9.3

---------------
openssl (1.0.2g-1ubuntu9.3) yakkety; urgency=medium

  * aes/asm/aesni-sha*-x86_64.pl: fix IV handling in SHAEXT paths.
    (LP: #1674399)

 -- William Grant  Fri, 19 May 2017 18:25:11 +1000

** Changed in: openssl (Ubuntu Yakkety)
       Status: Fix Committed => Fix Released

** Changed in: openssl (Ubuntu Zesty)
       Status: Fix Committed => Fix Released
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
This bug was fixed in the package openssl - 1.0.2g-1ubuntu11.2

---------------
openssl (1.0.2g-1ubuntu11.2) zesty; urgency=medium

  * aes/asm/aesni-sha*-x86_64.pl: fix IV handling in SHAEXT paths.
    (LP: #1674399)

 -- William Grant  Fri, 19 May 2017 18:29:44 +1000
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
This bug was fixed in the package openssl - 1.0.2g-1ubuntu4.8

---------------
openssl (1.0.2g-1ubuntu4.8) xenial; urgency=medium

  * aes/asm/aesni-sha*-x86_64.pl: fix IV handling in SHAEXT paths.
    (LP: #1674399)

 -- William Grant  Fri, 19 May 2017 18:27:58 +1000

** Changed in: openssl (Ubuntu Xenial)
       Status: Fix Committed => Fix Released
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
@Brian Murray, @sil2100

To summarize :

- wgrant did his verifications, which were all successful.
- I did mine, which are all still successful.
- I also reviewed all the regression failures reported on the pending SRU page and have provided an explanation for all of them above in the SRU justification template.

IMHO, I don't see any blockers for the SRU to complete now, but I'll let you guys (SRU) make the final decision.

Let us know if we can be of any help.

- Eric
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
@Brian Murray (brian-murray)

> Eric - Have you tested the new version of openssl which William uploaded?

I did, and my test case results are still the same as the ones from before wgrant's SRU. All good on my side.

IMHO, the only things left to check are the 2-3 regressions in X and Y for postgresql (X,Y) and mongodb (Y) on the armhf architecture. So far, at first glance, I can't see how these failures are related to our openssl changes, but I'll look at it in more depth.

The rest of the regressions found are explained in the description above, in the "Regression section" where it says : * autopktest failure.

- Eric

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to openssl in Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs

Status in openssl package in Ubuntu: Fix Released
Status in openssl source package in Xenial: Fix Committed
Status in openssl source package in Yakkety: Fix Committed
Status in openssl source package in Zesty: Fix Committed
Status in openssl source package in Artful: Fix Released
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
I've been running the fixed OpenSSL on my artful Ryzen desktop, zesty Haswell-U laptop and xenial Sandy Bridge server and its VMs for two weeks without incident. These hosts include a variety of VPN clients, HTTPS clients, HTTPS servers, SSH clients and servers, etc.

The SRUs for all series build successfully on i386 and amd64 on Ryzen and Apollo Lake (I managed to acquire some hardware), the two shipping platforms that were exposed to the previous regression, and which FTBFS with the original SRU due to the test suite picking up the bug.
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
Eric - Have you tested the new version of openssl which William uploaded?
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
William - could you add some information about the verification you did since you tagged this verification-done?

** Tags removed: verification-done
** Tags added: verification-needed
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Description changed:
[Regression Potential]

* Note : IRC discussion with infinity :
https://bugs.launchpad.net/ubuntu/xenial/+source/openssl/+bug/1674399/comments/8

* Note from IRC discussion with apw and rbasak :
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/comments/2

* It basically allows openssl to take advantage of the SHA extension (mostly performance-wise) now that new AMD CPUs are starting to have the capability.

* The code checks the CPUID bit to determine whether the SHA instructions are available or not.

* The maintainer's comment confirms that he tested successfully on Intel with/without the SHA extension.

Reference: https://github.com/openssl/openssl/issues/2848
"I don't have access to Ryzen system, so I didn't test it explicitly on Ryzen. Reporter did confirm it tough. Myself I tested on Intel processors, yes, with/without."

* LP reporter comment : I, slashd, have tested on a Ryzen system (and AMD non-ryzen) and a non-SHA Intel CPU.

It does reveal a significant performance increase on Ryzen due to the sha extension : (Note
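As a side note on the CPUID check mentioned in the Regression Potential notes: the inline-assembly function quoted in the description queries leaf 7 directly. A slightly more defensive variant could first confirm that leaf 7 exists at all. The sketch below is only an illustration under stated assumptions, not the code from the OpenSSL patch: it assumes GCC or Clang with the <cpuid.h> helper header, and the names ebx_has_sha and cpu_has_sha_ni are invented here.

```c
#include <stddef.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper header (assumed to be available) */
#endif

/* SHA extensions feature flag: CPUID.(EAX=07H, ECX=0):EBX bit 29. */
#define SHA_EBX_BIT 29

/* Pure helper: returns 1 when bit 29 is set in an EBX value from leaf 7. */
static int ebx_has_sha(unsigned int ebx)
{
    return (int)((ebx >> SHA_EBX_BIT) & 1u);
}

/*
 * Hypothetical wrapper: returns 1 if the CPU reports the SHA extensions,
 * 0 if it does not, and -1 when built for a non-x86 target.
 */
static int cpu_has_sha_ni(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Leaf 7 is only defined when the highest basic leaf is at least 7. */
    if (__get_cpuid_max(0, NULL) < 7)
        return 0;

    /* EAX = 7, ECX = 0, same query as the function in the description. */
    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    (void)eax; (void)ecx; (void)edx;   /* only EBX is of interest here */

    return ebx_has_sha(ebx);
#else
    return -1;
#endif
}
```

On a machine whose /proc/cpuinfo lists sha_ni, cpu_has_sha_ni() would be expected to return 1; splitting out ebx_has_sha() just keeps the bit test checkable without depending on the host CPU.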
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Description changed: [Impact] * Context: AMD added support in their processors for SHA Extensions[1] (CPU flag: sha_ni[2]) starting with Ryzen[3] CPU. Note that Ryzen CPU come in 64bit only (Confirmed with AMD representative). Current OpenSSL version in Ryzens still calls SHA for SSSE3 routine as result a number of extensions were effectively masked on Ryzen and shows no improvement. [1] /proc/cpuinfo processor : 0 vendor_id : AuthenticAMD cpu family : 23 model : 1 model name : AMD Ryzen 5 1600 Six-Core Processor flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse 4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflusho pt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold [2] - sha_ni: SHA1/SHA256 Instruction Extensions [3] - https://en.wikipedia.org/wiki/Ryzen ... All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.[5] ... * Program to performs the CPUID check: Reference : https://software.intel.com/en-us/articles/intel-sha-extensions ... Availability of the Intel® SHA Extensions on a particular processor can be determined by checking the SHA CPUID bit in CPUID.(EAX=07H, ECX=0):EBX.SHA [bit 29]. 
The following C function, using inline assembly, performs the CPUID check: -- int CheckForIntelShaExtensions() { int a, b, c, d; // Look for CPUID.7.0.EBX[29] // EAX = 7, ECX = 0 a = 7; c = 0; asm volatile ("cpuid" :"=a"(a), "=b"(b), "=c"(c), "=d"(d) :"a"(a), "c"(c) ); // Intel® SHA Extensions feature bit is EBX[29] return ((b >> 29) & 1); } -- On CPU with sha_ni the program return "1". Otherwise it return "0". [Test Case] * Reproducible with Xenial/Zesty/Artful release. * Generated a checksum of a big file (e.g. 5GB file) with openssl $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8 real 0m12.835s user 0m12.344s sys 0m0.484s * Openssl speed $ openssl speed sha1 Doing sha1 for 3s on 16 size blocks: 9969152 sha1's in 3.00s Doing sha1 for 3s on 64 size blocks: 8019164 sha1's in 3.00s Doing sha1 for 3s on 256 size blocks: 5254219 sha1's in 2.99s Doing sha1 for 3s on 1024 size blocks: 2217067 sha1's in 3.00s Doing sha1 for 3s on 8192 size blocks: 347842 sha1's in 3.00s OpenSSL 1.0.2g 1 Mar 2016 built on: reproducible build, date unspecified options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx) compiler: gcc -I. -I.. -I../include -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -m64 -DL_ENDIAN -O3 -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM The 'numbers' are in 1000s of bytes per second processed. type 16 bytes 64 bytes 256 bytes 1024 bytes 8192 bytes sha1 53168.81k 171075.50k 449859.55k 756758.87k 949840.55 The performance are clearly better when using the patch which take benefit of the sha extension. 
 (See the Regression Potential section for the result with the patch.)

[Regression Potential]

 * Note: IRC discussion with infinity:
 https://bugs.launchpad.net/ubuntu/xenial/+source/openssl/+bug/1674399/comments/8

 * Note from IRC discussion with apw and rbasak:
 https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/comments/2

 * It basically allows openssl to take advantage of the SHA extension (mostly performance-wise) now that new AMD CPUs are starting to have the capability.

 * The code checks the CPUID bit to determine whether the SHA instructions are available or not.

 * The maintainer's comment confirms that he tested successfully on Intel with and without the SHA extension.

 Reference: https://github.com/openssl/openssl/issues/2848
 "I don't have access to Ryzen system, so I didn't test it explicitly on Ryzen. Reporter did confirm it tough. Myself I tested on Intel processors, yes, with/without."

 * LP reporter comment:
 I, slashd, have tested on a Ryzen system (and a non-Ryzen AMD system) and on a non-SHA Intel CPU.

 It does reveal a significant performance increase on Ryzen due to the SHA extension: (Note
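The CPUID bit check mentioned in the Regression Potential notes can also be written without inline assembly. The sketch below is not the OpenSSL patch; it assumes GCC or Clang on x86, where <cpuid.h> provides __get_cpuid_max and __cpuid_count, and the function name has_sha_extensions is chosen here for illustration. Unlike the Intel sample quoted in the description, it first confirms that CPUID leaf 7 exists before reading EBX bit 29:

```c
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper header; x86 only (assumption) */
#endif

/* Hypothetical helper: returns 1 if CPUID.(EAX=07H, ECX=0):EBX[29]
 * (the SHA extensions bit) is set, 0 otherwise. */
int has_sha_extensions(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Highest supported basic CPUID leaf; 0 means CPUID is unavailable. */
    unsigned int max_leaf = __get_cpuid_max(0, 0);
    if (max_leaf < 7)
        return 0;   /* leaf 7 not implemented, so no SHA bit to read */

    /* Query leaf 7, sub-leaf 0 (structured extended feature flags). */
    __cpuid_count(7, 0, eax, ebx, ecx, edx);

    /* SHA extensions feature bit is EBX[29]. */
    return (int)((ebx >> 29) & 1u);
#else
    /* Not an x86 build: the SHA-NI instructions do not apply. */
    return 0;
#endif
}
```

On a machine whose /proc/cpuinfo lists sha_ni, this function is expected to return 1, and 0 elsewhere, which mirrors the behaviour described for CheckForIntelShaExtensions().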
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Tags removed: verification-failed verification-needed
** Tags added: verification-done

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to openssl in Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs

Status in openssl package in Ubuntu: Fix Released
Status in openssl source package in Xenial: Fix Committed
Status in openssl source package in Yakkety: Fix Committed
Status in openssl source package in Zesty: Fix Committed
Status in openssl source package in Artful: Fix Released
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Tags removed: verification-failed
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Tags removed: verification-failed
** Tags removed: patch
** Tags removed: ua
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
Removing the "regression-proposed" & "verification-failed" tags now that the regression fix is available for testing in -proposed.

** Tags removed: regression-proposed verification-failed
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
Hello Eric, or anyone else affected,

Accepted openssl into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/openssl/1.0.2g-1ubuntu11.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

** Tags removed: verification-failed
** Tags added: verification-needed
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
William, as per our IRC conversation, we have decided that you will do the upload for this specific fix for the 4 releases.

Thanks for your collaboration.

- Eric
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
Thanks William,

I'll set the proposed pkg as verification-failed, and will work on backporting the patch[1] you are suggesting, which has been proven to fix the issue.

[1] - https://github.com/openssl/openssl/commit/08d09628d2c9f3ef599399d8cad021a07ab98347

Eric

** Tags removed: verification-done-xenial verification-done-yakkety verification-done-zesty
** Tags added: verification-failed
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
Fortunately the OpenSSL test suite also fails when run during the build on Ryzen. It turns out that the AES-NI+SHA-NI AES-CBC+SHA{1,256} implementations are both broken, so https://github.com/openssl/openssl/commit/08d09628d2c9f3ef599399d8cad021a07ab98347 needs to be backported too. I guess nobody's seriously used Ubuntu on Goldmont.

I've uploaded fixed SRU test builds to https://launchpad.net/~wgrant/+archive/ubuntu/experimental/+packages?field.name_filter=openssl, and they all build and test successfully on i386 and amd64 on Ryzen. At least 9.3~ppa1 even lets OpenVPN connect, with accelerated hashing.

I don't think we really need to dig up a Goldmont device from somewhere, but if someone has one handy...
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
libssl1.0.0 1.0.2g-1ubuntu9.2 breaks OpenVPN (2.4.0-5ubuntu1 or 2.3.11-1ubuntu2) connections to Canonical's VPN on my Ryzen 7 1700X desktop running Linux 4.10.0-21-generic. In UDP mode the server stops responding during TLS negotiation, and in TCP mode the server closes the connection at the same stage. Downgrading to ubuntu9.1 fixes it. artful's 1.0.2g-1ubuntu12 is broken in the same way. The HMAC in use by the VPN is SHA-1.

From the server log:

ovpn-tcp[30227]: TCP connection established with [AF_INET]:44544
ovpn-tcp[30227]: :44544 TCP connection established with [AF_INET]:47753
ovpn-tcp[30227]: :44544 TLS_ERROR: BIO read tls_read_plaintext error: error:1408F119:SSL routines:SSL3_GET_RECORD:decryption failed or bad record mac
ovpn-tcp[30227]: :44544 TLS Error: TLS object -> incoming plaintext read error
ovpn-tcp[30227]: :44544 TLS Error: TLS handshake failed
ovpn-tcp[30227]: :44544 Fatal TLS error (check_tls_errors_co), restarting

The start of a client log is at http://paste.ubuntu.com/24603459/. Until the connection is closed by the server, it differs from a successful connection only in its keys and session IDs.

** Tags added: regression-proposed
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
The same verification testing has been done for zesty-proposed, with the same result as X and Y:

[Verification zesty]

# i386
- Significant performance increase using the zesty-proposed/i386 package inside a 32-bit LXD container built using a Ryzen CPU with Intel SHA Extension capability.
- Same performance (as expected) using the zesty-proposed/i386 package on a non SHA Extension Intel CPU (i7-6770HQ).

# amd64
- Significant performance increase using the zesty-proposed/amd64 package on a Ryzen CPU with Intel SHA Extension capability.
- Same performance (as expected) using the zesty-proposed/amd64 package on a non SHA Extension Intel CPU (i7-6770HQ).

Note: Unfortunately, neither I nor my colleagues have access to an Intel CPU with SHA Extension capability. Ideally, someone who has access to one would test it. Otherwise, I think it is safe to rely on the upstream author of the patch, who confirmed it was working as expected on an Intel CPU with SHA extension capability.

Reference: https://github.com/openssl/openssl/issues/2848
"...Myself I tested on Intel processors, yes, with/without"

Additionally, we also had some feedback from Justin Erenkrantz, an affected user with a Ryzen/Naples CPU. Please see comment #18 for Justin's feedback:
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/comments/18

** Tags removed: sts-sru
** Tags added: sts-sru-done
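The verification above distinguishes hosts with and without the SHA extensions, which the kernel reports as the sha_ni flag in /proc/cpuinfo. As a rough aid for classifying a test machine, here is a minimal Python sketch, assuming a Linux /proc/cpuinfo layout. The names has_cpu_flag and host_has_sha_ni are made up for illustration, and the two sample strings are shortened, hypothetical excerpts rather than real captures:

```python
def has_cpu_flag(cpuinfo_text, flag):
    """Return True if `flag` appears as a whole word on any 'flags' line."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags" and flag in value.split():
            return True
    return False


def host_has_sha_ni(path="/proc/cpuinfo"):
    """Check the running host; returns False if the file cannot be read."""
    try:
        with open(path) as f:
            return has_cpu_flag(f.read(), "sha_ni")
    except OSError:
        return False


# Shortened, hypothetical excerpts used only to exercise the parser.
RYZEN_SAMPLE = (
    "model name : AMD Ryzen 5 1600 Six-Core Processor\n"
    "flags : fpu vme ssse3 aes avx2 sha_ni xsaveopt\n"
)
NO_SHA_SAMPLE = (
    "model name : (a CPU without SHA extensions)\n"
    "flags : fpu vme ssse3 aes avx2 xsaveopt\n"
)

if __name__ == "__main__":
    print("sample with sha_ni:", has_cpu_flag(RYZEN_SAMPLE, "sha_ni"))
    print("sample without sha_ni:", has_cpu_flag(NO_SHA_SAMPLE, "sha_ni"))
    print("this host:", host_has_sha_ni())
```

The flag only says the CPU advertises the instructions; whether OpenSSL actually takes the SHA code path is what the speed measurements are meant to show.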
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
[Verification XENIAL]

# i386
- Significant performance increase using the xenial-proposed/i386 package inside a 32-bit LXD container built using a Ryzen CPU with Intel SHA Extension capability.
- Same performance (as expected) using the xenial-proposed/i386 package on a non SHA Extension Intel CPU (i7-6770HQ).

# amd64
- Significant performance increase using the xenial-proposed/amd64 package on a Ryzen CPU with Intel SHA Extension capability.
- Same performance (as expected) using the xenial-proposed/amd64 package on a non SHA Extension Intel CPU (i7-6770HQ).

Note: Unfortunately, neither I nor my colleagues have access to an Intel CPU with SHA Extension capability. Ideally, someone who has access to one would test it. Otherwise, I think it is safe to rely on the upstream author of the patch, who confirmed it was working as expected on an Intel CPU with SHA extension capability.

Reference: https://github.com/openssl/openssl/issues/2848
"...Myself I tested on Intel processors, yes, with/without"

==
* Test xenial/i386 on a 32-bit LXD container using a non SHA Extension Intel CPU:
--
ii libssl1.0.0:i386 1.0.2g-1ubuntu4.6 i386 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu4.6 i386 Secure Sockets Layer toolkit - cryptographic utility

# openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 12391058 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 8934411 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 5048901 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 1893157 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 301374 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: cc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type        16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1       66085.64k   190600.77k   430839.55k   646197.59k   822951.94k

# time openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 7f06c62352aebd8125b2a1841e2b9e1ffcbed602f381c3dcb3200200e383d1d5
real 0m15.518s
user 0m14.428s
sys 0m1.084s

==
* Test xenial-proposed/i386 on a 32-bit LXD container using a non SHA Extension Intel CPU:
--
ii libssl1.0.0:i386 1.0.2g-1ubuntu4.7 i386 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu4.7 i386 Secure Sockets Layer toolkit - cryptographic utility

# openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 12451389 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 8913173 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 5037978 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 1904530 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 303177 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: cc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type        16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1       66407.41k   190147.69k   429907.46k   650079.57k   827875.33k

# time openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 7f06c62352aebd8125b2a1841e2b9e1ffcbed602f381c3dcb3200200e383d1d5
real 0m15.259s
user 0m14.372s
sys 0m0.884s

==
* Test xenial/i386 on a 32-bit LXD container using a Ryzen CPU:
--
ii libssl1.0.0:i386 1.0.2g-1ubuntu4.6 i386 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu4.6 i386 Secure Sockets Layer toolkit - cryptographic utility

# openssl speed sha1
Doing s
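The summary row printed by `openssl speed` can be cross-checked against the per-block-size counts: each figure is count × block size ÷ elapsed seconds ÷ 1000. A small Python sketch of that arithmetic follows, using the counts from the xenial 1.0.2g-1ubuntu4.6 i386 run on the non SHA Extension Intel CPU; the helper name throughput_k is invented here for illustration:

```python
def throughput_k(count, block_bytes, seconds):
    """Thousands of bytes hashed per second, as in the `openssl speed` summary row."""
    return count * block_bytes / seconds / 1000.0


# (block size in bytes, sha1 operations completed, elapsed seconds),
# copied from the 1.0.2g-1ubuntu4.6 i386 run quoted in this message.
STOCK_RUN = [
    (16, 12391058, 3.00),
    (64, 8934411, 3.00),
    (256, 5048901, 3.00),
    (1024, 1893157, 3.00),
    (8192, 301374, 3.00),
]

if __name__ == "__main__":
    for block, count, secs in STOCK_RUN:
        print(f"{block:>5} bytes: {throughput_k(count, block, secs):.2f}k")
```

For the 16-byte row this gives 12391058 × 16 / 3.00 / 1000 = 66085.64k, which agrees with the reported summary line.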
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
We have tested these packages in zesty-proposed (openssl-1.0.2g-1ubuntu11.1) and can confirm that the SHA extension codepath is executed correctly and we see the accompanying expected performance improvements.

Thanks!
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
Hello Eric, or anyone else affected,

Accepted openssl into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/openssl/1.0.2g-1ubuntu11.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

** Changed in: openssl (Ubuntu Zesty)
Status: In Progress => Fix Committed

** Tags added: verification-needed

** Changed in: openssl (Ubuntu Xenial)
Status: In Progress => Fix Committed
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
[Verification YAKKETY]

# i386
- Significant performance increase using the yakkety-proposed/i386 package inside a 32-bit LXD container on a Ryzen CPU with Intel SHA Extension capability.
- Same performance (as expected) using the yakkety-proposed/i386 package on an Intel CPU without SHA Extensions (i7-6770HQ).

# amd64
- Significant performance increase using the yakkety-proposed/amd64 package on a Ryzen CPU with Intel SHA Extension capability.
- Same performance (as expected) using the yakkety-proposed/amd64 package on an Intel CPU without SHA Extensions (i7-6770HQ).

Note: Unfortunately, neither I nor my colleagues have access to an Intel CPU with SHA Extension capability. Ideally, someone who has access to one should test it. Otherwise, I think it is safe to rely on the upstream author of the patch, who confirmed it was working as expected on an Intel CPU with SHA Extension capability.

Reference: https://github.com/openssl/openssl/issues/2848
"...Myself I tested on Intel processors, yes, with/without"

==
* Test yakkety-proposed/i386 on a 32-bit LXD container using a non SHA Extension Intel CPU (Version before -proposed pkg):
--
ii libssl1.0.0:i386 1.0.2g-1ubuntu9.1 i386 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu9.1 i386 Secure Sockets Layer toolkit - cryptographic utility

# openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 12441833 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 8997589 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 5074636 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 1904828 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 304739 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: cc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -g -O2 -fdebug-prefix-map=/build/openssl-OIx07U/openssl-1.0.2g=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1            66356.44k   191948.57k   433035.61k   650181.29k   832140.63k

# time openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 7f06c62352aebd8125b2a1841e2b9e1ffcbed602f381c3dcb3200200e383d1d5

real 0m15.429s
user 0m14.372s
sys 0m1.052s

==
* Test yakkety-proposed/i386 on a 32-bit LXD container using a non SHA Extension Intel CPU (With -proposed pkg):
--
ii libssl1.0.0:i386 1.0.2g-1ubuntu9.2 i386 Secure Sockets Layer toolkit - shared libraries
ii openssl 1.0.2g-1ubuntu9.2 i386 Secure Sockets Layer toolkit - cryptographic utility

# openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 12414183 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 8947717 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 5057099 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 1905356 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 304628 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,32) rc4(8x,mmx) des(ptr,risc1,16,long) aes(partial) blowfish(idx)
compiler: cc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DL_ENDIAN -g -O2 -fdebug-prefix-map=/build/openssl-h4cyBe/openssl-1.0.2g=. -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -Wl,-Bsymbolic-functions -Wl,-z,relro -Wa,--noexecstack -Wall -DOPENSSL_BN_ASM_PART_WORDS -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DRMD160_ASM -DAES_ASM -DVPAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1            66208.98k   190884.63k   431539.11k   650361.51k   831837.53k

# time openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 7f06c62352aebd8125b2a1841e2b9e1ffcbed602f381c3dcb3200200e383d1d5

real 0m15.047s
user 0m14.352s
sys 0m0.692s

==
* Test yakkety-proposed/i386 on a 32-bit LXD container using a Ryzen CPU (Version before -proposed pkg):
--
ii libssl1.0.0:i386 1.0.2g-1ubuntu9.1
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
Hello Eric, or anyone else affected,

Accepted openssl into yakkety-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/openssl/1.0.2g-1ubuntu9.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

** Changed in: openssl (Ubuntu Yakkety)
Status: In Progress => Fix Committed

** Tags added: verification-needed

--
https://bugs.launchpad.net/bugs/1674399

Status in openssl package in Ubuntu: Fix Released
Status in openssl source package in Xenial: In Progress
Status in openssl source package in Yakkety: Fix Committed
Status in openssl source package in Zesty: In Progress
Status in openssl source package in Artful: Fix Released
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
yakkety_openssl_lp1674399.debdiff

** Patch added: "yakkety_openssl_lp1674399.debdiff"
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/+attachment/4868959/+files/yakkety_openssl_lp1674399.debdiff

--
https://bugs.launchpad.net/bugs/1674399

Status in openssl package in Ubuntu: Fix Released
Status in openssl source package in Xenial: In Progress
Status in openssl source package in Yakkety: In Progress
Status in openssl source package in Zesty: In Progress
Status in openssl source package in Artful: Fix Released
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
zesty_openssl_lp1674399.debdiff

** Description changed:

[Impact]

* Context:

AMD added support in their processors for SHA Extensions[1] (CPU flag: sha_ni[2]) starting with the Ryzen[3] CPU. Note that Ryzen CPUs come in 64-bit only (confirmed with an AMD representative).

The current OpenSSL version still calls the SSSE3 SHA routine on Ryzen; as a result, a number of extensions are effectively masked on Ryzen and no improvement is seen.

[1] /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 23
model : 1
model name : AMD Ryzen 5 1600 Six-Core Processor
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflusho pt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

[2] - sha_ni: SHA1/SHA256 Instruction Extensions

[3] - https://en.wikipedia.org/wiki/Ryzen
...
All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.[5]
...

* Program to perform the CPUID check:

Reference: https://software.intel.com/en-us/articles/intel-sha-extensions
...
Availability of the Intel® SHA Extensions on a particular processor can be determined by checking the SHA CPUID bit in CPUID.(EAX=07H, ECX=0):EBX.SHA [bit 29]. The following C function, using inline assembly, performs the CPUID check:

--
int CheckForIntelShaExtensions() {
    int a, b, c, d;

    // Look for CPUID.7.0.EBX[29]
    // EAX = 7, ECX = 0
    a = 7;
    c = 0;

    asm volatile ("cpuid"
        :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
        :"a"(a), "c"(c)
    );

    // Intel® SHA Extensions feature bit is EBX[29]
    return ((b >> 29) & 1);
}
--

On a CPU with sha_ni the program returns "1". Otherwise it returns "0".

[Test Case]

* Reproducible with Xenial/Zesty/Artful release.

* Generated a checksum of a big file (e.g. 5GB file) with openssl

$ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

real 0m12.835s
user 0m12.344s
sys 0m0.484s

* Openssl speed

$ openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 9969152 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 8019164 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 5254219 sha1's in 2.99s
Doing sha1 for 3s on 1024 size blocks: 2217067 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 347842 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -m64 -DL_ENDIAN -O3 -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1            53168.81k   171075.50k   449859.55k   756758.87k   949840.55k

The performance is clearly better when using the patch, which takes advantage of the SHA extension.
(See Regression Potential section for result with patch)

[Regression Potential]

* Note: IRC discussion with infinity:
https://bugs.launchpad.net/ubuntu/xenial/+source/openssl/+bug/1674399/comments/8

* Note from IRC discussion with apw and rbasak:
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/comments/2

* It basically allows openssl to take advantage of the SHA extension (mostly performance-wise) now that new AMD CPUs are starting to have the capability.

* The code checks the CPUID bit to determine whether the SHA instructions are available or not.

* The maintainer's comment confirms that he successfully tested on Intel with/without the SHA extension

Reference: https://github.com/openssl/openssl/issues/2848
"I don't have access to Ryzen system, so I didn't test it explicitly on Ryzen. Reporter did confirm it tough. Myself I tested on Intel processors, yes, with/without."

* LP reporter comment:

- I, slashd, have tested on a Ryzen system (and AMD non-ryzen) and it does reveal a significant performance increase on Ryzen:

- (Note that the p
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Description changed:

  [Regression Potential]

-  * Note : IRC discussion with infinity :
+  * Note : IRC discussion with infinity :
  https://bugs.launchpad.net/ubuntu/xenial/+source/openssl/+bug/1674399/comments/8

   * Note from irc discussion with apw and rbasak :
  https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/comments/2

   * It basically allow openssl to take benefit of sha extension potential (mostly performance-wise) now that new AMD cpu starting to have the capability.

   * The code check the CPUID bit to determine if the sha instructions are available are not.

-  * Generated a checksum of a big file (e.g. 5GB file) with openssl
+  * Maintainer comment proves that he did the successfully tested on Intel
+ with/without SHA extension
+
+ Reference: https://github.com/openssl/openssl/issues/2848
+ "I don't have access to Ryzen system, so I didn't test it explicitly on Ryzen. Reporter did confirm it tough. Myself I tested on Intel processors, yes, with/without."
+
+  * LP reporter comment :
+ I, slashd, have tested on a Ryzen system (and AMD non-ryzen) and it
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Also affects: openssl (Ubuntu Yakkety)
Importance: Undecided
Status: New

** Changed in: openssl (Ubuntu Yakkety)
Status: New => In Progress

** Changed in: openssl (Ubuntu Yakkety)
Importance: Undecided => Medium

** Changed in: openssl (Ubuntu Yakkety)
Assignee: (unassigned) => Eric Desrochers (slashd)

--
https://bugs.launchpad.net/bugs/1674399

Status in openssl package in Ubuntu: Fix Released
Status in openssl source package in Xenial: In Progress
Status in openssl source package in Yakkety: In Progress
Status in openssl source package in Zesty: In Progress
Status in openssl source package in Artful: Fix Released
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Patch removed: "zesty_openssl_lp1674399.debdiff"
https://bugs.launchpad.net/ubuntu/zesty/+source/openssl/+bug/1674399/+attachment/4868385/+files/zesty_openssl_lp1674399.debdiff

--
https://bugs.launchpad.net/bugs/1674399

Status in openssl package in Ubuntu: Fix Released
Status in openssl source package in Xenial: In Progress
Status in openssl source package in Zesty: In Progress
Status in openssl source package in Artful: Fix Released
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
This bug was fixed in the package openssl - 1.0.2g-1ubuntu12

---
openssl (1.0.2g-1ubuntu12) artful; urgency=medium

  * crypto/x86*cpuid.pl: move extended feature detection. (LP: #1674399)
    This fix moves extended feature detection past basic feature
    detection where it belongs. 32-bit counterpart is harmonized too.

 -- Eric Desrochers  Tue, 25 Apr 2017 18:16:18 -0400

** Changed in: openssl (Ubuntu Artful)
       Status: Fix Committed => Fix Released

Status in openssl package in Ubuntu: Fix Released
Status in openssl source package in Xenial: In Progress
Status in openssl source package in Zesty: In Progress
Status in openssl source package in Artful: Fix Released

Bug description:
  [Impact]

  * Context:

  AMD added support for SHA Extensions[1] (CPU flag: sha_ni[2]) to its processors, starting with the Ryzen[3] CPUs. Note that Ryzen CPUs come in 64-bit only (confirmed with an AMD representative).

  The current OpenSSL version still calls the SSSE3 SHA routine on Ryzen; as a result, a number of extensions are effectively masked on Ryzen and show no improvement.

  [1] /proc/cpuinfo
  processor : 0
  vendor_id : AuthenticAMD
  cpu family : 23
  model : 1
  model name : AMD Ryzen 5 1600 Six-Core Processor
  flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflusho pt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

  [2] - sha_ni: SHA1/SHA256 Instruction Extensions

  [3] - https://en.wikipedia.org/wiki/Ryzen
  ...
  All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.[5]
  ...

  * Program to perform the CPUID check:

  Reference: https://software.intel.com/en-us/articles/intel-sha-extensions

  ... Availability of the Intel® SHA Extensions on a particular processor can be determined by checking the SHA CPUID bit in CPUID.(EAX=07H, ECX=0):EBX.SHA [bit 29]. The following C function, using inline assembly, performs the CPUID check:

  --
  int CheckForIntelShaExtensions() {
     int a, b, c, d;

     // Look for CPUID.7.0.EBX[29]
     // EAX = 7, ECX = 0
     a = 7;
     c = 0;

     asm volatile ("cpuid"
          :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
          :"a"(a), "c"(c)
         );

     // Intel® SHA Extensions feature bit is EBX[29]
     return ((b >> 29) & 1);
  }
  --

  On a CPU with sha_ni the program returns "1". Otherwise it returns "0".

  [Test Case]

  * Reproducible with the Xenial/Zesty/Artful releases.

  * Generate a checksum of a big file (e.g. 5GB file) with openssl

  $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
  SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

  real 0m12.835s
  user 0m12.344s
  sys 0m0.484s

  * Openssl speed

  $ openssl speed sha1
  Doing sha1 for 3s on 16 size blocks: 9969152 sha1's in 3.00s
  Doing sha1 for 3s on 64 size blocks: 8019164 sha1's in 3.00s
  Doing sha1 for 3s on 256 size blocks: 5254219 sha1's in 2.99s
  Doing sha1 for 3s on 1024 size blocks: 2217067 sha1's in 3.00s
  Doing sha1 for 3s on 8192 size blocks: 347842 sha1's in 3.00s
  OpenSSL 1.0.2g 1 Mar 2016
  built on: reproducible build, date unspecified
  options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
  compiler: gcc -I. -I.. -I../include -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -m64 -DL_ENDIAN -O3 -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
  The 'numbers' are in 1000s of bytes per second processed.
  type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
  sha1            53168.81k   171075.50k   449859.55k   756758.87k   949840.55k

  Performance is clearly better with the patch, which takes advantage of the SHA extensions.
  (See the Regression Potential section for results with the patch.)

  [Regression Potential]

  * Note: IRC discussion with infinity:
    https://bugs.launchpad.net/ubuntu/xenial/+source/openssl/+bug/1674399/comments/8

  * Note from IRC discussion with apw and rbasak :
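The changelog entry in this message says the fix moves extended feature detection so that it is no longer tied to one vendor's code path. The idea can be illustrated with a minimal C sketch that queries CPUID leaf 7 without comparing the vendor string. This is only an illustration under stated assumptions: it relies on the `<cpuid.h>` helper header shipped with GCC and Clang, the function names `sha_bit_from_ebx` and `cpu_has_sha_extensions` are hypothetical, and it is not the OpenSSL code (the actual fix lives in the perlasm files crypto/x86_64cpuid.pl and crypto/x86cpuid.pl).

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang CPUID helpers (assumed toolchain) */
#endif

/* Pure helper: extract the SHA feature flag,
 * CPUID.(EAX=07H, ECX=0):EBX[29], from a raw EBX value. */
static int sha_bit_from_ebx(uint32_t ebx)
{
    return (int)((ebx >> 29) & 1u);
}

/* Returns 1 if the CPU reports the SHA extensions, 0 otherwise.
 * No vendor check is made, so any x86 CPU that sets the bit
 * (Intel, AMD Ryzen, ...) is reported as capable. */
static int cpu_has_sha_extensions(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Leaf 7 is only meaningful when the highest basic leaf is >= 7. */
    if (__get_cpuid_max(0, NULL) < 7)
        return 0;

    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    (void)eax; (void)ecx; (void)edx;   /* only EBX is needed here */
    return sha_bit_from_ebx(ebx);
#else
    return 0;   /* not an x86 CPU */
#endif
}
```

Unlike the function quoted in the bug description, this sketch checks the highest supported basic leaf before reading leaf 7, since the contents of an unsupported leaf should not be interpreted as feature bits. On a machine whose /proc/cpuinfo lists sha_ni, `cpu_has_sha_extensions()` would be expected to return 1.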
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Changed in: openssl (Ubuntu Artful)
       Status: In Progress => Triaged

** Changed in: openssl (Ubuntu Artful)
       Status: Triaged => Fix Committed

Status in openssl package in Ubuntu: Fix Committed
Status in openssl source package in Xenial: In Progress
Status in openssl source package in Zesty: In Progress
Status in openssl source package in Artful: Fix Committed
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
xenial_openssl_lp1674399.debdiff

** Patch added: "xenial_openssl_lp1674399.debdiff"
   https://bugs.launchpad.net/ubuntu/xenial/+source/openssl/+bug/1674399/+attachment/4868386/+files/xenial_openssl_lp1674399.debdiff

Status in openssl package in Ubuntu: In Progress
Status in openssl source package in Xenial: In Progress
Status in openssl source package in Zesty: In Progress
Status in openssl source package in Artful: In Progress
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
zesty_openssl_lp1674399.debdiff

** Patch added: "zesty_openssl_lp1674399.debdiff"
   https://bugs.launchpad.net/ubuntu/xenial/+source/openssl/+bug/1674399/+attachment/4868385/+files/zesty_openssl_lp1674399.debdiff

Status in openssl package in Ubuntu: In Progress
Status in openssl source package in Xenial: In Progress
Status in openssl source package in Zesty: In Progress
Status in openssl source package in Artful: In Progress
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
Here are the highlights of the discussion I had in #ubuntu-release with infinity about my proposal in comment #6.

For SRU, I had a talk with apw and rbasak about this bug a couples weeks ago LP: #1674399, could you please look at this bug and based on the Descriptions and comment #6 if this looks eligible for SRU in Stable release ? (note that this is a HW enablement, not a bug, this is why I'm requesting you to have a look at it) thanks in advance.
slashd: I disagree with your reasoning for not fixing both 64 and 32.
slashd: Lots of people run 64/32 multiarch and would benefit from fixing both.
infinity, I'm fine with fixing 32bit, I proposed that approach cause apw wanted to self-contained the fix as much as possible
That doesn't really contain it much. ;)
infinity, so what if I do the same proposition but including 32-bit in stable release, would that work for you ?
slashd: Conceptually, I have no issues with the plan (other than the "please do 32-bit too" comment).
infinity, sure, I'm actually glad you are keen to see the 32-bit portion included
slashd: Upload away, IMO.
infinity, I'll then start the upload for Artful, note that starting next week I'll be gone for 2 weeks for sprints, and won't be able to do much testing, so do you think it's preferable we only start the SRU when I get back or we upload this week and worst case it will languish in -proposed for ~2weeks which will allow ppls to test with no stress (if any volunteer)
slashd: I think letting it fester in proposed for two weeks to see if we get random negative feedback is entirely fine. Obviously, I'll delete/revert it if it breaks anything, but I don't need you around for that.
slashd: 2 weeks of random user installations plus you executing a more precision test plan should give us solid confidence.

** Description changed:
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Tags added: patch

Status in openssl package in Ubuntu: In Progress
Status in openssl source package in Xenial: In Progress
Status in openssl source package in Zesty: In Progress
Status in openssl source package in Artful: In Progress

Bug description:
  [Regression Potential]

  * Note from irc discussion with apw and rbasak :

  [10:03:20] slashd, for me some new functionality like that is ok as long as it is very self-contained so easy to review and confirm is only used on the new h/w
  [10:03:52] one of our main goals is to avoid regressions
  [10:12:24] The SRU policy does explicitly permit hardware enablement in an LTS IIRC, though I'd expect ~ubuntu-sru to be involved in mitigating risk and making the final risk decision, FWIW.

  Full discussion:
  https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/comments/2

  * It basically allow openssl to take benefit of sha extension
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
Attaching Artful debdiff

** Patch added: "artful_openssl_lp1674399.debdiff"
   https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/+attachment/4868343/+files/artful_openssl_lp1674399.debdiff

Status in openssl package in Ubuntu: In Progress
Status in openssl source package in Xenial: In Progress
Status in openssl source package in Zesty: In Progress
Status in openssl source package in Artful: In Progress
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
[For SRU Verification team]

Context :

Previous IRC discussion with apw/rbasak about this case :
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/comments/2

AMD released a new CPU called "Ryzen" that now supports the "Intel SHA extensions" technology. In the current Ubuntu openssl package, the sha extensions are masked on Ryzen CPUs. Ryzen is available as a 64-bit CPU only (confirmed with an AMD representative).

There are upstream patches that solve this situation, and my tests revealed that openssl has a significant performance increase on an AMD CPU with sha extension capability.

The upstream patches "f8418d8" & "1aed5e1" fix the situation by moving the extended feature detection from label (.Lintel) to label (.Lgeneric) in both the 64-bit (crypto/x86_64cpuid.pl) & 32-bit (crypto/x86cpuid.pl) code, where it belongs now that non-Intel CPUs can also have the capability.

I cannot strictly isolate the fix to Ryzen CPUs only, meaning that this change will enable the sha extension for all CPUs that support the functionality, where prior to this patch it was strictly reserved for Intel CPUs.

Since I don't foresee a lot of users using the 32-bit package of openssl on a 64-bit CPU with sha extension enabled, and putting myself in an SRU mindset, I propose the following :

- SRU the 64-bit and 32-bit patch in the development release (Artful/17.10)
- SRU the 64-bit patch only in the stable releases (Xenial/Zesty)
  - 32-bit code : Remains the same, thus no behavioural change
  - 64-bit code : Enable sha extension for 64-bit CPUs that have the capability.

The tests that I propose before "verification-done" (while the package is in -proposed) :

On each of the following systems, do a performance test using openssl speed[1] & generate a checksum of a big file[2] (e.g. 5GB), and capture the metrics to compare before and after the patch :

-> a 64-bit AMD system WITH sha extension capability (Ryzen).
-> a 64-bit AMD system WITHOUT sha extension capability.
-> a 64-bit Intel system WITH sha extension capability.
-> a 64-bit Intel system WITHOUT sha extension capability.

-> In all the above tests, perf record/annotate could also be used to validate whether the sha instructions have been used or not, depending on the scenario, to conclude that everything is working as expected.

No extra testing will be needed for the 32-bit package since no 32-bit code will be modified. On the other hand, if a user for some reason wants to use 32-bit openssl on a 64-bit CPU with sha extension capability, then this user won't get the performance benefit of the 64-bit code fix.

Considering that this request is a "HW enablement", thus not a bugfix, and the above notes... Would this be eligible for SRU in a Stable Release ?

[1] - Openssl speed measurement using sha1
$ openssl speed sha1

[2] - Measure time performance of generating a checksum of a big file (e.g. 5GB)
$ time openssl dgst -sha256 /var/tmp/5Gfile

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to openssl in Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs

Status in openssl package in Ubuntu:
  In Progress
Status in openssl source package in Xenial:
  In Progress
Status in openssl source package in Zesty:
  In Progress
Status in openssl source package in Artful:
  In Progress
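The four verification scenarios in this message differ only in CPU vendor and in whether the CPU advertises the sha extension. A minimal sketch for recording which scenario a test machine falls into, assuming a Linux system that exposes /proc/cpuinfo; the function name describe_cpu and its optional file argument are hypothetical and only used for illustration:

```shell
#!/bin/sh
# Hypothetical helper (not part of openssl or Ubuntu tooling): report the
# CPU vendor and whether the kernel lists the sha_ni flag.
# Reads /proc/cpuinfo by default; a saved copy can be passed as $1.
describe_cpu() {
    cpuinfo=${1:-/proc/cpuinfo}
    # First vendor_id line, e.g. "vendor_id : AuthenticAMD"
    vendor=$(awk -F': *' '/^vendor_id/ { print $2; exit }' "$cpuinfo")
    # -w matches sha_ni only as a whole word in the flags line
    if grep '^flags' "$cpuinfo" | grep -qw sha_ni; then
        sha=WITH
    else
        sha=WITHOUT
    fi
    echo "${vendor:-unknown} $sha sha extension"
}

# Only query the live system when /proc/cpuinfo is readable.
if [ -r /proc/cpuinfo ]; then
    describe_cpu
fi
```

Saving this one line next to each benchmark capture makes it easier to tell afterwards which of the four scenarios a given result belongs to.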
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Description changed:

  [Regression Potential]

  * Note from irc discussion with apw and rbasak :

  [10:03:20] slashd, for me some new functionality like that is ok as long as it is very self-contained so easy to review and confirm is only used on the new h/w
  [10:03:52] one of our main goals is to avoid regressions
  [10:12:24] The SRU policy does explicitly permit hardware enablement in an LTS IIRC, though I'd expect ~ubuntu-sru to be involved in mitigating risk and making the final risk decision, FWIW.

- * None expected, it basically allow openssl to take benefit of sha
- extension potential (mostly performance-wise) now that new AMD cpu
- starting have the capability.
+ Full discussion:
+ https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/comments/2
+
+ * It basically allow openssl to take benefit of sha extension potential
+ (mostly performance-wise) now that new AMD cpu starting to have the
+ capability.
+
+ * The code check the CPUID bit to determine if the sha instructions are
+ available are not.

  * Generated a checksum of a big file (e.g. 5GB f
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
Another test "Openssl speed"

[Without patch]

$ openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 9969152 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 8019164 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 5254219 sha1's in 2.99s
Doing sha1 for 3s on 1024 size blocks: 2217067 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 347842 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -m64 -DL_ENDIAN -O3 -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type          16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1         53168.81k   171075.50k   449859.55k   756758.87k   949840.55

[With patch]

$ openssl speed sha1
Doing sha1 for 3s on 16 size blocks: 12081890 sha1's in 3.00s
Doing sha1 for 3s on 64 size blocks: 11563950 sha1's in 3.00s
Doing sha1 for 3s on 256 size blocks: 8375101 sha1's in 3.00s
Doing sha1 for 3s on 1024 size blocks: 3987643 sha1's in 3.00s
Doing sha1 for 3s on 8192 size blocks: 678036 sha1's in 3.00s
OpenSSL 1.0.2g 1 Mar 2016
built on: reproducible build, date unspecified
options:bn(64,64) rc4(8x,int) des(idx,cisc,16,int) aes(partial) idea(int) blowfish(idx)
compiler: gcc -I. -I.. -I../include -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -Wa,--noexecstack -m64 -DL_ENDIAN -O3 -Wall -DOPENSSL_IA32_SSE2 -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_MONT5 -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DMD5_ASM -DAES_ASM -DVPAES_ASM -DBSAES_ASM -DWHIRLPOOL_ASM -DGHASH_ASM -DECP_NISTZ256_ASM
The 'numbers' are in 1000s of bytes per second processed.
type          16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
sha1         64436.75k   246697.60k   714675.29k  1361115.48k  1851490.30k

** Description changed:

  [Test Case]

  * Reproducible with Xenial/Zesty/Artful release.

  * Generated a checksum of a big file (e.g. 5GB file) with openssl
  $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
  SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

  real 0m12.835s
  user 0m12.344s
  sys 0m0.484s

+ * Openssl speed
+ $ openssl speed sha1
+ Doing sha1 for 3s on 16 size blocks: 9969152 sha1's in 3.00s
+ Doing sha1 for 3s on 64 size blocks: 8019164 sha1's in 3.00s
+ Doing sha1 for 3s on 256 size blocks: 5254219 sh
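To compare the two "openssl speed sha1" runs reported in this message, the throughput ratio can be computed with a small shell helper. This is only a sketch: the function name speedup is made up for illustration, and the sample call uses the 8192-byte sha1 figures from the runs without and with the patch.

```shell
#!/bin/sh
# Hypothetical helper (not part of openssl): print after/before with two
# decimals. Arguments are "openssl speed" throughput figures in 1000s of
# bytes per second; a trailing "k" is stripped if present.
speedup() {
    before=${1%k}
    after=${2%k}
    awk -v b="$before" -v a="$after" 'BEGIN { printf "%.2f\n", a / b }'
}

# sha1, 8192-byte blocks: without patch vs. with patch
speedup 949840.55 1851490.30k    # prints 1.95
```

The same call with the 16-byte figures (53168.81k and 64436.75k) gives 1.21, which suggests that the gain from the sha instructions grows with block size on this system.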
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
So far my tests reveal the following :

# Note that the tests below were run on a Ryzen system #

[Without patch]

* Generated a checksum of a big file (e.g. 5GB file) with openssl
$ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

real 0m12.835s
user 0m12.344s
sys 0m0.484s

[With patch]

* Generated a checksum of a big file (e.g. 5GB file) with openssl
$ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8

real 0m3.471s
user 0m2.956s
sys 0m0.516s

** Description changed:

+ [Impact]
+
+ * Context:
+
+ AMD added support in their processors for SHA Extensions[1] (CPU flag:
+ sha_ni[2]) starting with Ryzen[3] CPU. Note that Ryzen CPU come in 64bit
+ only (Confirmed with AMD representative). Current OpenSSL version in
+ Ryzens still calls SHA for SSSE3 routine as result a number of
+ extensions were effectively masked on Ryzen and shows no improvement.
+
+ [1] /proc/cpuinfo
+ processor : 0
+ vendor_id : AuthenticAMD
+ cpu family : 23
+ model : 1
+ model name : AMD Ryzen 5 1600 Six-Core Processor
+ flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold
+
+ [2] - sha_ni: SHA1/SHA256 Instruction Extensions
+
+ [3] - https://en.wikipedia.org/wiki/Ryzen
+ ...
+ All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.[5]
+ ...
+
+ * Program to performs the CPUID check:
+
+ Reference :
+ https://software.intel.com/en-us/articles/intel-sha-extensions
+
+ ... Availability of the Intel® SHA Extensions on a particular processor
+ can be determined by checking the SHA CPUID bit in CPUID.(EAX=07H,
+ ECX=0):EBX.SHA [bit 29]. The following C function, using inline
+ assembly, performs the CPUID check:
+
+ --
+ int CheckForIntelShaExtensions() {
+    int a, b, c, d;
+
+    // Look for CPUID.7.0.EBX[29]
+    // EAX = 7, ECX = 0
+    a = 7;
+    c = 0;
+
+    asm volatile ("cpuid"
+         :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
+         :"a"(a), "c"(c)
+    );
+
+    // Intel® SHA Extensions feature bit is EBX[29]
+    return ((b >> 29) & 1);
+ }
+ --
+
+ On CPU with sha_ni the program return "1". Otherwise it return "0".
+
+ [Test Case]
+
+ * Reproducible with Xenial/Zesty/Artful release.
+
+ * Generated a checksum of a big file (e.g. 5GB file) with openssl
+ $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
+ SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8
+
+ real 0m12.835s
+ user 0m12.344s
+ sys 0m0.484s
+
+
+ [Regression Potential]
+
+ * None expected, it basically allow openssl to take benefit of sha
+ extension potential (mostly performance-wise) if AMD cpu has the
+ capability.
+
+ * Generated a checksum of a big file (e.g. 5GB file) with openssl
+ $ time /usr/bin/openssl dgst -sha256 /var/tmp/5Gfile
+ SHA256(/var/tmp/5Gfile)= 8d448d81521cbc1bfdc04dd199d448bd3c49374221007bd0846d8d39a70dd4f8
+
+ real 0m3.471s
+ user 0m2.956s
+ sys 0m0.516s
+
+
+ [Other Info]
+
+ * Debian Bug : https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=861145
+
+ * Upstream Repository : https://github.com/openssl/openssl.git
+
+ * Upstream Commits :
+ 1aed5e1 crypto/x86*cpuid.pl: move extended feature detection.
+ ## This fix moves extended feature detection past basic feature detection where it belongs.
+
+ f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards.
+ ## This commit for x86_64cpuid.pl addressed the problem, but messed up processor vendor detection.
+
+ [Original Description]
+
  * Context

  AMD added support in their processors for SHA Extensions[1] (CPU flag: sha_ni[2]) starting with Ryzen[3] CPU. Note that Ryzen CPU come in 64bit only (Confirmed with AMD representative). Current OpenSSL version in Ryzens still calls SHA for SSSE3 routine as result a number of extensions were effectively masked on Ryzen and shows no improvement.

  [1] /proc/cpuinfo
  processor : 0
  vendor_id : AuthenticAMD
  cpu family: 23
  model : 1
  model name: AMD Ryzen 5 1600 Six-Core Processor
  flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush
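The "real" times reported in this message (without and with the patch) can be compared the same way. A minimal sketch, assuming the "XmY.YYYs" format printed by the shell's time keyword; the function names to_seconds and time_ratio are hypothetical:

```shell
#!/bin/sh
# Hypothetical helpers for comparing "time" output.
# to_seconds converts the "XmY.YYYs" format to plain seconds.
to_seconds() {
    awk -v t="$1" 'BEGIN {
        split(t, part, "m")        # part[1] = minutes, part[2] = "SS.mmms"
        sub(/s$/, "", part[2])     # drop the trailing "s"
        printf "%.3f\n", part[1] * 60 + part[2]
    }'
}

# time_ratio prints elapsed time before / after with two decimals.
time_ratio() {
    awk -v b="$(to_seconds "$1")" -v a="$(to_seconds "$2")" \
        'BEGIN { printf "%.2f\n", b / a }'
}

# sha256 of the 5GB file: "real" without patch vs. with patch
time_ratio 0m12.835s 0m3.471s    # prints 3.70
```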
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
Debian bug : https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=861145

** Bug watch added: Debian Bug tracker #861145
   http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=861145

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to openssl in Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs

Status in openssl package in Ubuntu:
  In Progress
Status in openssl source package in Xenial:
  Triaged
Status in openssl source package in Zesty:
  Triaged
Status in openssl source package in Artful:
  In Progress

Bug description:
  * Context

  AMD added support in their processors for SHA Extensions[1] (CPU flag: sha_ni[2]) starting with the Ryzen[3] CPU. Note that Ryzen CPUs come in 64-bit only (confirmed with an AMD representative). The current OpenSSL version on Ryzen still calls the SSSE3 routine for SHA; as a result, a number of extensions are effectively masked on Ryzen and show no improvement.

  [1] /proc/cpuinfo
  processor : 0
  vendor_id : AuthenticAMD
  cpu family: 23
  model : 1
  model name: AMD Ryzen 5 1600 Six-Core Processor
  flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

  [2] - sha_ni: SHA1/SHA256 Instruction Extensions

  [3] - https://en.wikipedia.org/wiki/Ryzen
  ...
  All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.[5]
  ...

  * Program to perform the CPUID check

  Reference : https://software.intel.com/en-us/articles/intel-sha-extensions

  ... Availability of the Intel® SHA Extensions on a particular processor can be determined by checking the SHA CPUID bit in CPUID.(EAX=07H, ECX=0):EBX.SHA [bit 29]. The following C function, using inline assembly, performs the CPUID check:

  --
  int CheckForIntelShaExtensions() {
      int a, b, c, d;

      // Look for CPUID.7.0.EBX[29]
      // EAX = 7, ECX = 0
      a = 7;
      c = 0;

      asm volatile ("cpuid"
          :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
          :"a"(a), "c"(c)
      );

      // Intel® SHA Extensions feature bit is EBX[29]
      return ((b >> 29) & 1);
  }
  --

  On a CPU with sha_ni the function returns "1". Otherwise it returns "0".

  * Upstream work:

  - Repository : https://github.com/openssl/openssl.git

  - Commits :
  1aed5e1 crypto/x86*cpuid.pl: move extended feature detection.
  ## This fix moves extended feature detection past basic feature detection where it belongs.

  f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards.
  ## This commit for x86_64cpuid.pl addressed the problem, but messed up processor vendor detection.

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/1674399/+subscriptions

--
Mailing list: https://launchpad.net/~touch-packages
Post to     : touch-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~touch-packages
More help   : https://help.launchpad.net/ListHelp
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Changed in: openssl (Ubuntu Xenial)
     Assignee: (unassigned) => Eric Desrochers (slashd)

** Changed in: openssl (Ubuntu Zesty)
     Assignee: (unassigned) => Eric Desrochers (slashd)

** Changed in: openssl (Ubuntu Artful)
     Assignee: (unassigned) => Eric Desrochers (slashd)

** Changed in: openssl (Ubuntu Artful)
       Status: Triaged => In Progress

--
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to openssl in Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs

Status in openssl package in Ubuntu:
  In Progress
Status in openssl source package in Xenial:
  Triaged
Status in openssl source package in Zesty:
  Triaged
Status in openssl source package in Artful:
  In Progress
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Description changed:

  * Context

  AMD added support in their processors for SHA Extensions[1] (CPU flag:
  sha_ni[2]) starting with the Ryzen[3] CPU. Note that Ryzen CPUs are
  64-bit only (confirmed with an AMD representative). The current OpenSSL
  version on Ryzen still calls the SSSE3 SHA routine, because a number of
  extensions were effectively masked on Ryzen, and so shows no improvement.

  [1] /proc/cpuinfo
  processor : 0
  vendor_id : AuthenticAMD
  cpu family: 23
  model : 1
  model name: AMD Ryzen 5 1600 Six-Core Processor
  flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

  [2] - sha_ni: SHA1/SHA256 Instruction Extensions

  [3] - https://en.wikipedia.org/wiki/Ryzen
  ...
  All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.[5]
  ...

  * Program to perform the CPUID check

  Reference : https://software.intel.com/en-us/articles/intel-sha-extensions
  ...
  Availability of the Intel® SHA Extensions on a particular processor can be determined by checking the SHA CPUID bit in CPUID.(EAX=07H, ECX=0):EBX.SHA [bit 29]. The following C function, using inline assembly, performs the CPUID check:

  --
  int CheckForIntelShaExtensions() {
     int a, b, c, d;

     // Look for CPUID.7.0.EBX[29]
     // EAX = 7, ECX = 0
     a = 7;
     c = 0;

     asm volatile ("cpuid"
          :"=a"(a), "=b"(b), "=c"(c), "=d"(d)
          :"a"(a), "c"(c)
     );

     // Intel® SHA Extensions feature bit is EBX[29]
     return ((b >> 29) & 1);
  }
  --

  On a CPU with sha_ni the program returns "1". Otherwise it returns "0".

  * Upstream work:

  - Repository : https://github.com/openssl/openssl.git
  - Commits :

  1aed5e1 crypto/x86*cpuid.pl: move extended feature detection.
- ## This commit for x86_64cpuid.pl addressed the problem, but messed up processor vendor detection.
+ ## This fix moves extended feature detection past basic feature detection where it belongs.

  f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards.
- ## This fix moves extended feature detection past basic feature detection where it belongs.
+ ## This commit for x86_64cpuid.pl addressed the problem, but messed up processor vendor detection.

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to openssl in Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs

Status in openssl package in Ubuntu:
  Triaged
Status in openssl source package in Xenial:
  Triaged
Status in openssl source package in Zesty:
  Triaged
Status in openssl source package in Artful:
  Triaged

Bug description:
  * Context

  AMD added support in their processors for SHA Extensions[1] (CPU flag:
  sha_ni[2]) starting with the Ryzen[3] CPU. Note that Ryzen CPUs are
  64-bit only (confirmed with an AMD representative). The current OpenSSL
  version on Ryzen still calls the SSSE3 SHA routine, because a number of
  extensions were effectively masked on Ryzen, and so shows no improvement.

  [1] /proc/cpuinfo
  processor : 0
  vendor_id : AuthenticAMD
  cpu family: 23
  model : 1
  model name: AMD Ryzen 5 1600 Six-Core Processor
  flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold

  [2] - sha_ni: SHA1/SHA256 Instruction Extensions

  [3] - https://en.wikipedia.org/wiki/Ryzen
  ...
  All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.[5]
  ...

  * Program to perform the CPUID check

  Reference : https://software.intel.c
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Description changed: * Context - AMD added support in their processors for SHA Extensions[1] / CPU flag: - sha_ni[2] starting with Ryzen CPU. Current OpenSSL version in Ryzens - still calls SHA for SSSE3 routine as result a number of extensions were - effectively masked on Ryzen and shows no improvement. + AMD added support in their processors for SHA Extensions[1] (CPU flag: + sha_ni[2]) starting with Ryzen[3] CPU. Note that Ryzen CPU come in 64bit + only (Confirmed with AMD representative). Current OpenSSL version in + Ryzens still calls SHA for SSSE3 routine as result a number of + extensions were effectively masked on Ryzen and shows no improvement. [1] /proc/cpuinfo processor : 0 vendor_id : AuthenticAMD cpu family: 23 model : 1 model name: AMD Ryzen 5 1600 Six-Core Processor flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse 4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflusho pt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold [2] - sha_ni: SHA1/SHA256 Instruction Extensions + + [3] - https://en.wikipedia.org/wiki/Ryzen + ... + All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.[5] + ... 
* Program to performs the CPUID check Reference : https://software.intel.com/en-us/articles/intel-sha-extensions -- int CheckForIntelShaExtensions() { int a, b, c, d; // Look for CPUID.7.0.EBX[29] // EAX = 7, ECX = 0 a = 7; c = 0; asm volatile ("cpuid" :"=a"(a), "=b"(b), "=c"(c), "=d"(d) :"a"(a), "c"(c) ); // Intel® SHA Extensions feature bit is EBX[29] return ((b >> 29) & 1); } -- On CPU with sha_ni the program return "1". Otherwise it return "0". * Upstream work: - Repository : https://github.com/openssl/openssl.git - Commits : 1aed5e1 crypto/x86*cpuid.pl: move extended feature detection. f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards. ** Description changed: * Context AMD added support in their processors for SHA Extensions[1] (CPU flag: sha_ni[2]) starting with Ryzen[3] CPU. Note that Ryzen CPU come in 64bit only (Confirmed with AMD representative). Current OpenSSL version in Ryzens still calls SHA for SSSE3 routine as result a number of extensions were effectively masked on Ryzen and shows no improvement. [1] /proc/cpuinfo processor : 0 vendor_id : AuthenticAMD cpu family: 23 model : 1 model name: AMD Ryzen 5 1600 Six-Core Processor flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse 4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflusho pt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold [2] - sha_ni: SHA1/SHA256 Instruction Extensions [3] - https://en.wikipedia.org/wiki/Ryzen ... 
All models support: x87, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AES, CLMUL, AVX, AVX2, FMA, CVT16/F16C, ABM, BMI1, BMI2, SHA.[5] ... * Program to performs the CPUID check Reference : https://software.intel.com/en-us/articles/intel-sha-extensions -- int CheckForIntelShaExtensions() { int a, b, c, d; // Look for CPUID.7.0.EBX[29] // EAX = 7, ECX = 0 a = 7; c = 0; asm volatile ("cpuid" :"=a"(a), "=b"(b), "=c"(c), "=d"(d) :"a"(a), "c"(c) ); // Intel® SHA Extensions feature bit is EBX[29] return ((b >> 29) & 1); } -- On CPU with sha_ni the program return "1". Otherwise it return "0". * Upstream work: - Repository : https://github.com/openssl/openssl.git - Commits : - 1aed5e1 crypto/x86*cpuid.pl: move extended feature detection. - f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards. + 1aed5e1 crypto/x86*cpuid.pl: move extended feature detection. --> This commit for
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Description changed: - AMD added support in their processors for SHA Extensions starting with - Ryzen CPU. Current OpenSSL version in Ryzens still calls SHA for SSSE3 - routine as result a number of extensions were effectively masked on - Ryzen and shows no improvement. + * Context - It has been brought to my attention that : - "CPUID detection in OpenSSL does not properly detect potential optimizations for AMD processors." + AMD added support in their processors for SHA Extensions[1] starting + with Ryzen CPU. Current OpenSSL version in Ryzens still calls SHA for + SSSE3 routine as result a number of extensions were effectively masked + on Ryzen and shows no improvement. - After further verification on my side : + [1] /proc/cpuinfo + processor : 0 + vendor_id : AuthenticAMD + cpu family: 23 + model : 1 + model name: AMD Ryzen 5 1600 Six-Core Processor + flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse + 4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflusho + pt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold - Extended feature flags were not pulled on AMD processors, as result a - number of extensions were effectively masked on Ryzen CPUs. It should - have been reported for Excavator since it implements AVX2 extension, but - apparently nobody noticed ... 
- The GitHub PR: - https://github.com/openssl/openssl/pull/2849 + * Program to performs the CPUID check + + Reference : + https://software.intel.com/en-us/articles/intel-sha-extensions + + -- + int CheckForIntelShaExtensions() { +int a, b, c, d; + +// Look for CPUID.7.0.EBX[29] +// EAX = 7, ECX = 0 +a = 7; +c = 0; + +asm volatile ("cpuid" + :"=a"(a), "=b"(b), "=c"(c), "=d"(d) + :"a"(a), "c"(c) +); + +// Intel® SHA Extensions feature bit is EBX[29] +return ((b >> 29) & 1); + } + -- + + * Upstream work: + + - Repository : https://github.com/openssl/openssl.git + - Commits : + + 1aed5e1 crypto/x86*cpuid.pl: move extended feature detection. + f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards. ** Description changed: * Context - AMD added support in their processors for SHA Extensions[1] starting - with Ryzen CPU. Current OpenSSL version in Ryzens still calls SHA for - SSSE3 routine as result a number of extensions were effectively masked - on Ryzen and shows no improvement. + AMD added support in their processors for SHA Extensions[1] (sha_ni) + starting with Ryzen CPU. Current OpenSSL version in Ryzens still calls + SHA for SSSE3 routine as result a number of extensions were effectively + masked on Ryzen and shows no improvement. 
[1] /proc/cpuinfo processor : 0 vendor_id : AuthenticAMD cpu family: 23 model : 1 model name: AMD Ryzen 5 1600 Six-Core Processor flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc extd_apicid aperfmperf eagerfpu pni pclmulqdq monitor ssse3 fma cx16 sse 4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_l2 mwaitx hw_pstate vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflusho pt sha_ni xsaveopt xsavec xgetbv1 clzero arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold - * Program to performs the CPUID check Reference : https://software.intel.com/en-us/articles/intel-sha-extensions -- int CheckForIntelShaExtensions() { -int a, b, c, d; + int a, b, c, d; -// Look for CPUID.7.0.EBX[29] -// EAX = 7, ECX = 0 -a = 7; -c = 0; + // Look for CPUID.7.0.EBX[29] + // EAX = 7, ECX = 0 + a = 7; + c = 0; -asm volatile ("cpuid" - :"=a"(a), "=b"(b), "=c"(c), "=d"(d) - :"a"(a), "c"(c) -); + asm volatile ("cpuid" + :"=a"(a), "=b"(b), "=c"(c), "=d"(d) + :"a"(a), "c"(c) + ); -// Intel® SHA Extensions feature bit is EBX[29] -return ((b >> 29) & 1); + // Intel® SHA Extensions feature bit is EBX[29] + return ((b >> 29) & 1); } -- * Upstream work: - Repository : https://github.com/openssl/openssl.git - Commits : 1aed5e1 crypto/x86*cpuid.pl: m
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Also affects: openssl (Ubuntu Zesty)
   Importance: Undecided
   Status: New

** Also affects: openssl (Ubuntu Xenial)
   Importance: Undecided
   Status: New

** Changed in: openssl (Ubuntu Xenial)
   Status: New => Triaged

** Changed in: openssl (Ubuntu Zesty)
   Status: New => Triaged

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to openssl in Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs

Status in openssl package in Ubuntu:
  Triaged
Status in openssl source package in Xenial:
  Triaged
Status in openssl source package in Zesty:
  Triaged
Status in openssl source package in Artful:
  Triaged

Bug description:
  AMD added support in their processors for SHA Extensions starting with
  the Ryzen CPU. The current OpenSSL version on Ryzen still calls the
  SSSE3 SHA routine, because a number of extensions were effectively
  masked on Ryzen, and so shows no improvement.

  It has been brought to my attention that:
  "CPUID detection in OpenSSL does not properly detect potential optimizations for AMD processors."

  After further verification on my side:

  Extended feature flags were not pulled on AMD processors; as a result,
  a number of extensions were effectively masked on Ryzen CPUs. This
  should have been reported for Excavator, since it implements the AVX2
  extension, but apparently nobody noticed ...

  The GitHub PR:
  https://github.com/openssl/openssl/pull/2849
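For readers following the thread, here is a minimal sketch of how the "extended feature flags" mentioned in the description can be queried on any x86 vendor. It is not OpenSSL's code (the upstream detection lives in the Perl-generated assembly in crypto/x86*cpuid.pl). The function names `ebx_has_sha` and `cpu_has_sha_extensions` are hypothetical, and the sketch assumes a GCC- or Clang-compatible compiler that ships `<cpuid.h>`. It checks that CPUID leaf 7 exists before reading EBX bit 29, the SHA bit quoted from the Intel article elsewhere in this bug.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>              /* GCC/Clang helper header (assumption) */
#define HAVE_X86_CPUID 1
#endif

/* CPUID.(EAX=07H, ECX=0):EBX bit 29 reports the SHA extensions. */
#define CPUID7_EBX_SHA_BIT 29

/* Hypothetical helper: extract the SHA bit from a leaf-7 EBX value. */
int ebx_has_sha(uint32_t ebx)
{
    return (int)((ebx >> CPUID7_EBX_SHA_BIT) & 1u);
}

/*
 * Hypothetical helper: returns 1 if the CPU reports SHA extensions,
 * 0 if it does not, and -1 if CPUID cannot be queried in this build.
 * No vendor check is made: the feature bit is read for any vendor.
 */
int cpu_has_sha_extensions(void)
{
#ifdef HAVE_X86_CPUID
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Leaf 7 is only valid if the highest basic leaf is at least 7. */
    if (__get_cpuid_max(0, NULL) < 7)
        return 0;

    __cpuid_count(7, 0, eax, ebx, ecx, edx);
    (void)eax; (void)ecx; (void)edx;
    return ebx_has_sha(ebx);
#else
    return -1;
#endif
}
```

On a Ryzen system this would be expected to report the SHA bit as set, matching the sha_ni flag in /proc/cpuinfo; whether OpenSSL then uses the SHA code path depends on the fix discussed in this bug.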
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Description changed:

+ AMD added support in their processors for SHA Extensions starting with
+ Ryzen CPU. Current OpenSSL version in Ryzens still calls SHA for SSSE3
+ routine and shows no improvement.
+ 
  It has been brought to my attention that :
  "CPUID detection in OpenSSL does not properly detect potential optimizations for AMD processors."

  After further verification on my side :

  Extended feature flags were not pulled on AMD processors, as result a
  number of extensions were effectively masked on Ryzen CPUs. It should
  have been reported for Excavator since it implements AVX2 extension, but
  apparently nobody noticed ...

  The GitHub PR:
  https://github.com/openssl/openssl/pull/2849

** Description changed:

  AMD added support in their processors for SHA Extensions starting with
  Ryzen CPU. Current OpenSSL version in Ryzens still calls SHA for SSSE3
- routine and shows no improvement.
+ routine as result a number of extensions were effectively masked on
+ Ryzen and shows no improvement.

  It has been brought to my attention that :
  "CPUID detection in OpenSSL does not properly detect potential optimizations for AMD processors."

  After further verification on my side :

  Extended feature flags were not pulled on AMD processors, as result a
  number of extensions were effectively masked on Ryzen CPUs. It should
  have been reported for Excavator since it implements AVX2 extension, but
  apparently nobody noticed ...

  The GitHub PR:
  https://github.com/openssl/openssl/pull/2849

** Also affects: openssl (Ubuntu Artful)
   Importance: Medium
   Status: Triaged

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to openssl in Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs

Status in openssl package in Ubuntu:
  Triaged
Status in openssl source package in Artful:
  Triaged

Bug description:
  AMD added support in their processors for SHA Extensions starting with
  the Ryzen CPU. The current OpenSSL version on Ryzen still calls the
  SSSE3 SHA routine, because a number of extensions were effectively
  masked on Ryzen, and so shows no improvement.

  It has been brought to my attention that:
  "CPUID detection in OpenSSL does not properly detect potential optimizations for AMD processors."

  After further verification on my side:

  Extended feature flags were not pulled on AMD processors; as a result,
  a number of extensions were effectively masked on Ryzen CPUs. This
  should have been reported for Excavator, since it implements the AVX2
  extension, but apparently nobody noticed ...

  The GitHub PR:
  https://github.com/openssl/openssl/pull/2849
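On Linux, the SHA Extensions discussed above show up as the sha_ni flag in /proc/cpuinfo. As a rough sketch only, the helper below checks whether a flag appears as a whole token in a flags line that has already been read into a string. The name `flags_contain` is hypothetical, and reading the file itself is left out. Matching whole tokens matters because a plain substring search would, for example, find "sse3" inside "ssse3".

```c
#include <string.h>

/*
 * Hypothetical helper: returns 1 if `flag` appears as a whole
 * whitespace-separated token in `flags`, 0 otherwise.
 */
int flags_contain(const char *flags, const char *flag)
{
    size_t len = strlen(flag);
    const char *p = flags;

    if (len == 0)
        return 0;

    while ((p = strstr(p, flag)) != NULL) {
        /* The match must start at the beginning or after whitespace... */
        int starts_token = (p == flags) || p[-1] == ' ' || p[-1] == '\t';
        /* ...and must be followed by whitespace or the end of the string. */
        int ends_token = p[len] == '\0' || p[len] == ' ' ||
                         p[len] == '\t' || p[len] == '\n';

        if (starts_token && ends_token)
            return 1;
        p++;    /* keep scanning after a partial (substring) match */
    }
    return 0;
}
```

A call such as `flags_contain(line, "sha_ni")` only says what the kernel reports for the CPU; it does not show which routine OpenSSL actually selects.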
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Changed in: openssl (Ubuntu)
   Assignee: Eric Desrochers (slashd) => (unassigned)

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to openssl in Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs

Status in openssl package in Ubuntu:
  Triaged

Bug description:
  It has been brought to my attention that:
  "CPUID detection in OpenSSL does not properly detect potential optimizations for AMD processors."

  After further verification on my side:

  Extended feature flags were not pulled on AMD processors; as a result,
  a number of extensions were effectively masked on Ryzen CPUs. This
  should have been reported for Excavator, since it implements the AVX2
  extension, but apparently nobody noticed ...

  The GitHub PR:
  https://github.com/openssl/openssl/pull/2849
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
Here's some context after a conversation about this bug on channel: #ubuntu-release

...
[10:01:50] hi SRU, I'm currently working on a case (no LP bug yet) about an OpenSSL bug on the new AMD CPU (Ryzen) released last Feb ... where the SHA Extension routine is not called on AMD Ryzen cores. My question is: since this looks like H/W enablement ... do you think this could be eligible for SRU in a stable release such as Xenial? Or will this only be accepted for the devel release? This is a new CPU, but Xenial will be around for a couple of years still, so future Xenial users running a Ryzen CPU may eventually benefit from this...
[10:03:20] slashd, for me some new functionality like that is ok as long as it is very self-contained so easy to review and confirm is only used on the new h/w
[10:03:52] one of our main goals is to avoid regressions
[10:04:41] apw, makes sense, thanks for your input
[10:12:24] The SRU policy does explicitly permit hardware enablement in an LTS IIRC, though I'd expect ~ubuntu-sru to be involved in mitigating risk and making the final risk decision, FWIW.
[10:16:11] rbasak, right, it would have to be carefully considered once we can see what the diff actually is
[10:16:34] rbasak, apw ack, will communicate the info with the proper group
[10:16:50] with a much greater level of testing and scrutiny than a regular fix only sru
[10:17:12] apw, rbasak, FYI I have requested the new CPU from our partner to test in depth
...

** Changed in: openssl (Ubuntu)
   Milestone: ubuntu-16.04.2 => None

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to openssl in Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs

Status in openssl package in Ubuntu:
  Triaged

Bug description:
  It has been brought to my attention that:
  "CPUID detection in OpenSSL does not properly detect potential optimizations for AMD processors."

  After further verification on my side:

  Extended feature flags were not pulled on AMD processors; as a result,
  a number of extensions were effectively masked on Ryzen CPUs. This
  should have been reported for Excavator, since it implements the AVX2
  extension, but apparently nobody noticed ...

  The GitHub PR:
  https://github.com/openssl/openssl/pull/2849
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
** Changed in: openssl (Ubuntu)
   Importance: Low => Medium

** Changed in: openssl (Ubuntu)
   Status: New => Triaged

** Changed in: openssl (Ubuntu)
   Milestone: None => ubuntu-16.04.2

** Changed in: openssl (Ubuntu)
   Assignee: (unassigned) => Eric Desrochers (slashd)

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to openssl in Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs

Status in openssl package in Ubuntu:
  Triaged

Bug description:
  It has been brought to my attention that:
  "CPUID detection in OpenSSL does not properly detect potential optimizations for AMD processors."

  After further verification on my side:

  Extended feature flags were not pulled on AMD processors; as a result,
  a number of extensions were effectively masked on Ryzen CPUs. This
  should have been reported for Excavator, since it implements the AVX2
  extension, but apparently nobody noticed ...

  The GitHub PR:
  https://github.com/openssl/openssl/pull/2849
[Touch-packages] [Bug 1674399] Re: OpenSSL CPU detection for AMD Ryzen CPUs
* Repository : https://github.com/openssl/openssl.git

* Commits :

1aed5e1 crypto/x86*cpuid.pl: move extended feature detection.
f8418d8 crypto/x86_64cpuid.pl: move extended feature detection upwards.

- Eric

-- 
You received this bug notification because you are a member of Ubuntu Touch seeded packages, which is subscribed to openssl in Ubuntu.
https://bugs.launchpad.net/bugs/1674399

Title:
  OpenSSL CPU detection for AMD Ryzen CPUs

Status in openssl package in Ubuntu:
  New

Bug description:
  It has been brought to my attention that:
  "CPUID detection in OpenSSL does not properly detect potential optimizations for AMD processors."

  After further verification on my side:

  Extended feature flags were not pulled on AMD processors; as a result,
  a number of extensions were effectively masked on Ryzen CPUs. This
  should have been reported for Excavator, since it implements the AVX2
  extension, but apparently nobody noticed ...

  The GitHub PR:
  https://github.com/openssl/openssl/pull/2849