Re: OpenSSL ASM patch
On Fri, Feb 16, 2001 at 11:58:50PM -0700, Warner Losh wrote:
> In message [EMAIL PROTECTED] Wes Peters writes:
> : Peter Jeremy wrote:
> : > [1] I don't think there's a lot of `build once, install on lots of
> : > different hardware', though I could be wrong.
> :
> : Most certainly wrong for those using FreeBSD for embedded devices. I,
> : for instance, build on nice, fast Athlons, then install in devices
> : ranging from 500 MHz K6-2s all the way down to 266 MHz Geodes. We can
> : always do multiple library builds and select the correct .so as we
> : build an install image for a particular platform, but it sure makes the
> : process longer and more error-prone.
> :
> : You should see sshd key generation crawl on a Geode 266 with no keyboard
> : to bang on. ;^)
>
> I do the build once, install many times all the time. I have a fast
> 4.2-stable box that I do the build on. I then do the installworld on
> a slower (ppro 200) -current box, my laptop, my router box, and other
> misc boxes in my house. Sometimes this build is -current and other
> times it is -stable. I use it to make sure that the 4.x upgrade path
> to -current works. :-)

Well, these are both 686-class machines so it doesn't strictly apply
to what we were talking about.

Kris
PGP signature
Re: OpenSSL ASM patch
In message [EMAIL PROTECTED] Kris Kennaway writes:
: Well, these are both 686-class machines so it doesn't strictly apply
: to what we were talking about.

The build machine is a 686, but the targets are {486, 586 and 686}.
The laptop is a 586...

Warner

To Unsubscribe: send mail to [EMAIL PROTECTED] with "unsubscribe freebsd-current" in the body of the message
Re: OpenSSL ASM patch
I do plenty of build once and run on multiple machines. My biggest
machine is a PII 40MHz where I compile the world and kernels for a 486
laptop and P-60 Router/Firewall. I would not really want to compile
the world on these slower machines over NFS.

For my case, I guess I could rebuild only the ssl library for each
machine, properly tuned. For my small collection of machines, that
wouldn't be too bad, but for larger sites it would be a problem. I
could probably find some way to compile multiple versions with
buildworld and figure out the correct one to install with installworld.

Jim Bloom
[EMAIL PROTECTED]

Peter Jeremy wrote:
> IMHO, the main market for this feature would be people who just do
> binary installs - if you're doing a buildworld, you can tune to your
> hardware[1]. If we wanted to just speed up OpenSSL on binary
> installs, we could have processor-optimised variants of libssl.*
> available as packages (tick the box that suits your processor if you
> want the optimised library).
>
> [1] I don't think there's a lot of `build once, install on lots of
>     different hardware', though I could be wrong.
Re: OpenSSL ASM patch
Jim Bloom [EMAIL PROTECTED] writes:
> I do plenty of build once and run on multiple machines. My biggest
> machine is a PII 40MHz where I compile the world and kernels for a 486
> laptop and P-60 Router/Firewall. I would not really want to compile
> the world on these slower machines over NFS.

We also have a number of different machines for which we use a single
kernel/userland compile, ranging from old K6s and PIIs (and perhaps
a few Pentium MMXs) to recent PIIIs. There are large numbers of these
machines, and no way to reasonably make variant kernels for them all.
I'm sure other people running large numbers of servers accumulated over
time have a similar problem (Yahoo?).

So I'd add one vote for making it easy (or at least not hard) to
include multiple architecture optimizations in one kernel/userland
release, à la Solaris.

-- 
Randell Jesup, Worldgate Communications, ex-Scala, ex-Amiga OS team ('88-94)
[EMAIL PROTECTED]
Re: OpenSSL ASM patch
Peter Jeremy wrote:
> [1] I don't think there's a lot of `build once, install on lots of
>     different hardware', though I could be wrong.

Most certainly wrong for those using FreeBSD for embedded devices. I,
for instance, build on nice, fast Athlons, then install in devices
ranging from 500 MHz K6-2s all the way down to 266 MHz Geodes. We can
always do multiple library builds and select the correct .so as we
build an install image for a particular platform, but it sure makes the
process longer and more error-prone.

You should see sshd key generation crawl on a Geode 266 with no keyboard
to bang on. ;^)

-- 
"Where am I, and what am I doing in this handbasket?"
Wes Peters                                            Softweyr LLC
[EMAIL PROTECTED]                              http://softweyr.com/
Re: OpenSSL ASM patch
In message [EMAIL PROTECTED] Wes Peters writes:
: Peter Jeremy wrote:
: > [1] I don't think there's a lot of `build once, install on lots of
: > different hardware', though I could be wrong.
:
: Most certainly wrong for those using FreeBSD for embedded devices. I,
: for instance, build on nice, fast Athlons, then install in devices
: ranging from 500 MHz K6-2s all the way down to 266 MHz Geodes. We can
: always do multiple library builds and select the correct .so as we
: build an install image for a particular platform, but it sure makes the
: process longer and more error-prone.
:
: You should see sshd key generation crawl on a Geode 266 with no keyboard
: to bang on. ;^)

I do the build once, install many times all the time. I have a fast
4.2-stable box that I do the build on. I then do the installworld on
a slower (ppro 200) -current box, my laptop, my router box, and other
misc boxes in my house. Sometimes this build is -current and other
times it is -stable. I use it to make sure that the 4.x upgrade path
to -current works. :-)

Warner
Re: OpenSSL ASM patch
On 2001-Feb-11 13:02:43 -0800, Alfred Perlstein [EMAIL PROTECTED] wrote:
> * Kris Kennaway [EMAIL PROTECTED] [010211 12:52] wrote:
>> On Sun, Feb 11, 2001 at 12:47:07PM -0800, Alfred Perlstein wrote:
>>> Is it possible to have multiple ASM cores and use the appropriate
>>> routines? Or must it all be chosen at compile time?
>>
>> It's done at compile-time.
>
> bah, lame. :(

AFAIK, Solaris does this by (very roughly) having /usr/lib/libfoo.so
depend on /usr/lib/machine/libfoo.so, where /usr/lib/machine is a
symlink to the relevant set of architecture-specific libraries. The
dynamic loader preferentially uses the machine-specific library. This
means you get architecture-optimised routines with no additional
overhead.

I'm sure something similar would be possible with FreeBSD, but I don't
have the expertise to actually implement it. I'm less certain how
much of a win this would be in the general scheme of things: Apart
from special cases (like OpenSSL), I don't think the libraries have a
significant impact on overall performance.

IMHO, the main market for this feature would be people who just do
binary installs - if you're doing a buildworld, you can tune to your
hardware[1]. If we wanted to just speed up OpenSSL on binary
installs, we could have processor-optimised variants of libssl.*
available as packages (tick the box that suits your processor if you
want the optimised library).

[1] I don't think there's a lot of `build once, install on lots of
    different hardware', though I could be wrong.

Peter
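The lookup order Peter describes (try the machine-specific library first, fall back to the generic one) can be sketched in a few lines of C. This is only an illustration of the preference order, not how the Solaris or FreeBSD run-time linker is implemented; the function names `library_candidates` and `pick_library`, the `PATH_BUF` limit, and the rule that the library name must not contain a slash are all assumptions made for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define PATH_BUF 256   /* hypothetical upper bound on a candidate path */

/*
 * Fill `paths` with the two candidates in search order:
 *   paths[0] = <libdir>/machine/<name>   (machine-specific, preferred)
 *   paths[1] = <libdir>/<name>           (generic fallback)
 * Returns 0 on success, -1 if the name contains '/' or a path is too long.
 */
int library_candidates(const char *libdir, const char *name,
                       char paths[2][PATH_BUF])
{
    int n;

    assert(libdir != NULL && name != NULL);
    if (strchr(name, '/') != NULL)
        return -1;              /* expect a bare file name */

    n = snprintf(paths[0], PATH_BUF, "%s/machine/%s", libdir, name);
    if (n < 0 || n >= PATH_BUF)
        return -1;
    n = snprintf(paths[1], PATH_BUF, "%s/%s", libdir, name);
    if (n < 0 || n >= PATH_BUF)
        return -1;
    return 0;
}

/* Return the first candidate that is readable, or NULL if neither is. */
const char *pick_library(char paths[2][PATH_BUF])
{
    for (int i = 0; i < 2; i++)
        if (access(paths[i], R_OK) == 0)
            return paths[i];
    return NULL;
}
```

With `/usr/lib/machine` as a symlink to a CPU-specific directory, the same binary install would pick up whichever variant the symlink points at, which is the property the build-once, install-many posters are asking for.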
Re: OpenSSL ASM patch
On Fri, Feb 16, 2001 at 03:57:57PM +1100, Peter Jeremy wrote:
> I'm sure something similar would be possible with FreeBSD, but I don't
> have the expertise to actually implement it. I'm less certain how
> much of a win this would be in the general scheme of things: Apart
> from special cases (like OpenSSL), I don't think the libraries have a
> significant impact on overall performance.

This would be quite doable, but I agree with you in thinking there
aren't many people who would make use of it. If the kernel were to
become dynamically tunable so e.g. GENERIC would dynamically select
between the various CPU-specific asm optimizations, then there'd be
more of a justification for making a generic userland self-tuning as
well.

> IMHO, the main market for this feature would be people who just do
> binary installs - if you're doing a buildworld, you can tune to your
> hardware[1]. If we wanted to just speed up OpenSSL on binary
> installs, we could have processor-optimised variants of libssl.*
> available as packages (tick the box that suits your processor if you
> want the optimised library).

If/when we ever get a packaged base system this would be a good and
easy thing to do. We could do it now, but it wouldn't be natural in
the sysinstall scheme of things (i.e. you'd have to install the OS,
and then select the OpenSSL-i686 package from the listing of packages
in the ports tree).

Kris
Re: OpenSSL ASM patch
Updated patch now available at the same location. Changes:

* Document the MACHINE_CPU types which are currently used
* Make NOPERL mutually exclusive with OpenSSL ASM and document it
* Teach make(1) about MACHINE_CPU and provide sensible defaults for
  i386 and alpha.

http://www.freebsd.org/~kris/openssl-asm.patch

Kris
Re: OpenSSL ASM patch
* Kris Kennaway [EMAIL PROTECTED] [010211 12:32] wrote:
> Updated patch now available at the same location. Changes:
>
> * Document the MACHINE_CPU types which are currently used
> * Make NOPERL mutually exclusive with OpenSSL ASM and document it
> * Teach make(1) about MACHINE_CPU and provide sensible defaults for
>   i386 and alpha.
>
> http://www.freebsd.org/~kris/openssl-asm.patch

Looks awesome. Someone complained that Linux was able to maintain
an order of magnitude more SSL connections than FreeBSD; since
you say this gives us a 3-5x speed up, I'd really like to see
it committed and ported to -stable ASAP.

Is it possible to have multiple ASM cores and use the appropriate
routines? Or must it all be chosen at compile time?

-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
"I have the heart of a child; I keep it in a jar on my desk."
Re: OpenSSL ASM patch
On Sun, Feb 11, 2001 at 12:47:07PM -0800, Alfred Perlstein wrote:
> Looks awesome. Someone complained that Linux was able to maintain
> an order of magnitude more SSL connections than FreeBSD; since
> you say this gives us a 3-5x speed up, I'd really like to see
> it committed and ported to -stable ASAP.

Yep! Just want to give a few days for people to comment on the
MACHINE_CPU thing.

> Is it possible to have multiple ASM cores and use the appropriate
> routines? Or must it all be chosen at compile time?

It's done at compile-time.

Kris
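For contrast with the compile-time selection in the patch, here is a minimal sketch of what "multiple ASM cores" chosen at run time could look like in C: every variant is built into the library and a function pointer is resolved once. Nothing here is taken from OpenSSL or the patch; `cpu_class`, `checksum_fn`, `select_checksum` and the checksum routines are hypothetical names, and the "optimised" variants are plain C stand-ins for what would really be separate assembler files.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical CPU classes, mirroring the i386/i586/i686 split in the thread. */
enum cpu_class { CPU_I386, CPU_I586, CPU_I686 };

typedef unsigned long (*checksum_fn)(const unsigned char *buf, size_t len);

/* Portable baseline routine. */
static unsigned long checksum_generic(const unsigned char *buf, size_t len)
{
    unsigned long sum = 0;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        sum += buf[i];
    return sum;
}

/* Stand-ins for CPU-specific assembler routines; they must return the
 * same result as the baseline, only (in a real library) faster. */
static unsigned long checksum_i586(const unsigned char *buf, size_t len)
{
    return checksum_generic(buf, len);
}

static unsigned long checksum_i686(const unsigned char *buf, size_t len)
{
    return checksum_generic(buf, len);
}

/* Resolve the routine once, based on the CPU class detected at start-up. */
checksum_fn select_checksum(enum cpu_class cpu)
{
    switch (cpu) {
    case CPU_I686: return checksum_i686;
    case CPU_I586: return checksum_i586;
    default:       return checksum_generic;
    }
}
```

A caller would do `checksum_fn f = select_checksum(cpu);` once and then call `f(buf, len)`. The trade-off is a larger library and an indirect call per invocation, in exchange for one binary that suits every CPU class it may be installed on.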
Re: OpenSSL ASM patch
* Kris Kennaway [EMAIL PROTECTED] [010211 12:52] wrote:
> On Sun, Feb 11, 2001 at 12:47:07PM -0800, Alfred Perlstein wrote:
>> Looks awesome. Someone complained that Linux was able to maintain
>> an order of magnitude more SSL connections than FreeBSD; since
>> you say this gives us a 3-5x speed up, I'd really like to see
>> it committed and ported to -stable ASAP.
>
> Yep! Just want to give a few days for people to comment on the
> MACHINE_CPU thing.
>
>> Is it possible to have multiple ASM cores and use the appropriate
>> routines? Or must it all be chosen at compile time?
>
> It's done at compile-time.

bah, lame. :(

How is the worst asm code vs the best C code again?

-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
"I have the heart of a child; I keep it in a jar on my desk."
Re: OpenSSL ASM patch
On Sun, Feb 11, 2001 at 01:02:43PM -0800, Alfred Perlstein wrote:
> * Kris Kennaway [EMAIL PROTECTED] [010211 12:52] wrote:
>> On Sun, Feb 11, 2001 at 12:47:07PM -0800, Alfred Perlstein wrote:
>>> Looks awesome. Someone complained that Linux was able to maintain
>>> an order of magnitude more SSL connections than FreeBSD; since
>>> you say this gives us a 3-5x speed up, I'd really like to see
>>> it committed and ported to -stable ASAP.
>>
>> Yep! Just want to give a few days for people to comment on the
>> MACHINE_CPU thing.
>>
>>> Is it possible to have multiple ASM cores and use the appropriate
>>> routines? Or must it all be chosen at compile time?
>>
>> It's done at compile-time.
>
> bah, lame. :(
>
> How is the worst asm code vs the best C code again?

OpenSSL includes 386 and 586 asm for the following:

bf, bn (number libraries), cast, des, md5, rc4, rc5, ripemd, sha1.

and 686 asm for bf only (DES is broken)

In fact there's not a lot of difference between (what are claimed to
be) the i386 versions and the i586 versions (they're generated from
the same source by a preprocessor, and in fact are identical for
some/most files) - this probably means they are not very optimal.

I was also wrong about the speed improvements (they're not quite so
high, only around 2x) - perhaps my baseline benchmark was sharing the
CPU with something else giving it a 2x slowdown. So I'm not sure where
the 3-5x speed up comes from - either it's another rumour (you didn't
hear it from Peter again, did you? :) or the cause is elsewhere. What
we build now should be exactly in line with what OpenSSL does itself.

These measurements were done on my PPro 233, and no attempt at sample
averaging was performed :-)

Kris

[C code]
type                  8 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md2                   159.52k      437.09k      590.38k      647.48k      653.41k
mdc2                  405.46k      440.38k      439.93k      442.00k      442.93k
md4                  2415.06k    12806.00k    24615.33k    32313.88k    35873.96k
md5                  1888.65k     9092.61k    16840.50k    20897.62k    22739.51k
hmac(md5)             741.81k     4722.98k    11755.58k    18427.53k    22120.47k
sha1                 1319.27k     3052.54k     6990.83k    10423.14k    11986.67k
rmd160                846.12k     3629.76k     6249.11k     7644.14k     8178.40k
rc4                 13176.13k    17308.64k    18127.45k    18709.00k    18527.01k
des cbc              2589.75k     2911.96k     2918.99k     2930.14k     2961.85k
des ede3              719.78k      751.80k      758.33k      758.61k      761.84k
idea cbc                 0.00         0.00         0.00         0.00         0.00
rc2 cbc              1476.49k     1540.60k     1551.96k     1547.98k     1571.13k
rc5-32/12 cbc        6533.14k     8820.63k     9144.01k     9159.59k     9189.25k
blowfish cbc         3921.72k     4490.54k     4551.53k     4567.12k     4582.91k
cast cbc             3725.39k     4496.47k     4425.20k     4432.26k     4461.36k
                      sign    verify    sign/s  verify/s
rsa 512 bits       0.0106s   0.0011s      94.5     951.4
rsa 1024 bits      0.0620s   0.0034s      16.1     296.7
rsa 2048 bits      0.3963s   0.0112s       2.5      89.4
rsa 4096 bits      2.6106s   0.0389s       0.4      25.7
                      sign    verify    sign/s  verify/s
dsa 512 bits       0.0109s   0.0134s      91.5      74.6
dsa 1024 bits      0.0342s   0.0406s      29.3      24.6

[i386]
type                  8 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5                  2525.24k    13682.82k    26954.24k    34031.00k    38153.23k
hmac(md5)             952.04k     6381.75k    17338.03k    29527.15k    37320.02k
sha1                 1621.91k     6960.45k    11626.82k    13810.67k    14685.97k
rmd160               1238.63k     5838.79k    10350.12k    12930.47k    13941.05k
rc4                 18170.79k    24351.64k    25941.40k    26300.99k    26613.25k
des cbc              4743.99k     5342.60k     5377.98k     5406.58k     5379.56k
des ede3             1809.64k     1903.68k     1908.81k     1921.80k     1928.79k
rc5-32/12 cbc       11934.06k    15701.79k    16004.71k    16014.24k    16569.98k
blowfish cbc         5885.08k     6493.90k     6553.44k     6575.91k     6569.06k
cast cbc             5889.94k     6558.54k     6578.21k     6627.23k     6571.16k
                      sign    verify    sign/s  verify/s
rsa 512 bits       0.0057s   0.0005s     174.2    1822.0
rsa 1024 bits      0.0299s   0.0016s      33.4     641.4
rsa 2048 bits      0.1757s   0.0052s       5.7     193.5
rsa 4096 bits      1.1865s   0.0179s       0.8      55.8
                      sign    verify    sign/s  verify/s
dsa 512 bits       0.0057s   0.0068s     176.4     146.8
dsa 1024 bits      0.0157s   0.0185s      63.8      54.1
dsa 2048 bits      0.0503s   0.0621s      19.9      16.1

[i586]
type                  8 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
md5                  2588.19k    13504.14k    26623.31k    35248.59k    38189.04k
hmac(md5)             946.51k     6358.38k    17134.34k    29501.74k    37456.10k
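To check the "only around 2x" remark against the figures above, the speed-up can be computed directly from pairs of table entries. The two helpers below are a small sketch with hypothetical names: one for the throughput columns (higher is better) and one for the sign/verify times (lower is better). The inputs in the note that follows are taken from the [C code] and [i386] tables.

```c
#include <assert.h>

/* Speed-up for throughput figures (k bytes per second): asm / C. */
double throughput_speedup(double c_kbytes_per_s, double asm_kbytes_per_s)
{
    assert(c_kbytes_per_s > 0.0);
    return asm_kbytes_per_s / c_kbytes_per_s;
}

/* Speed-up for timing figures (seconds per operation): C / asm. */
double latency_speedup(double c_seconds, double asm_seconds)
{
    assert(asm_seconds > 0.0);
    return c_seconds / asm_seconds;
}
```

For 8192-byte blocks, `throughput_speedup(22739.51, 38153.23)` gives roughly 1.68 for md5 and `throughput_speedup(761.84, 1928.79)` roughly 2.53 for des ede3; for RSA 1024-bit signing, `latency_speedup(0.0620, 0.0299)` gives roughly 2.07. That is consistent with a gain of about 2x rather than 3-5x on this machine.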
Re: OpenSSL ASM patch
On Sun, Feb 11, 2001 at 12:28:02PM -0800, Kris Kennaway wrote:
> Updated patch now available at the same location. Changes:
>
> * Document the MACHINE_CPU types which are currently used

Actually, it occurs to me that this will be useful for ports as
well. Currently some of them have nonstandard knobs like WANT_3DNOW,
which could be easily covered by sticking an appropriate 'k6' (or
whatever) into your MACHINE_CPU variable.

Kris