Hi,

> After the changes to DH requiring longer key lengths, I switched to 2048-bit
> keys, but was finding this was now making my test runs on an embedded ARM9
> target annoyingly slow; so thought I'd investigate to see if there was
> anything to improve.
>
> With some experimentation, it turns out that if I *stop* using the
> crypto/bn/asm/bn/armv4-mont.pl generated asm "optimised" version, the time for
> a simplish test to establish and close a simple SSL connection went from 28
> seconds to 18. (It's quite a slow target at any time).
>
> In other words, this "optimised" version has slowed things down dramatically.
> Has anyone queried the value of the asm of armv4-mont.pl any time in the last
> few years?

Yes, of course. For reference, here are speed rsa2048 dsa2048 results from
a Cortex-A8. The sign/s and verify/s columns are operations per second, so
higher is better.

Without armv4-mont.pl:

                  sign    verify    sign/s verify/s
rsa 2048 bits 0.052684s 0.001421s     19.0    703.5
dsa 2048 bits 0.014576s 0.017526s     68.6     57.1

With armv4-mont.pl but without NEON (ARM SIMD extension):

rsa 2048 bits 0.039255s 0.001140s     25.5    877.3
dsa 2048 bits 0.011630s 0.013900s     86.0     71.9

With armv4-mont.pl and NEON on:

rsa 2048 bits 0.021053s 0.000606s     47.5   1650.2
dsa 2048 bits 0.006084s 0.006985s    164.4    143.2

Well, RSA/DSA are not DH, but they are very representative when it comes to
sheer BIGNUM performance. And of course Cortex-A8 is not ARM9, but at least
it shows that the statement about armv4-mont.pl being bad for performance
does not hold universally. Rather the contrary, as a similar picture can be
observed on most ARM processors (well, all I tested).

> Is it just that compilers have become better (I'm only using gcc
> 4.7.3, so not bleeding edge even).

I don't think so. BIGNUM performance can be a delicate balance between
multiple factors, and it's not impossible to end up on the other side of a
breaking point. What breaking point? If you examine the performance
improvement with and without the Montgomery multiplication module, you'll
notice that there are processors on which the improvement coefficient
declines with key length: the longer the key, the lower the improvement.
This indicates that there ought to be a point past which you observe worse
performance, not better. So far such points fell outside practical key
lengths on tested systems, ARM or not. Well, except for the s390x-mont
module [which by the way even discusses the reasons why such a breaking
point exists, see commentary in bn/asm/s390x-mont.pl]. In other words, I
argue that your case is a case of finding yourself on the other side of
said breaking point on a specific CPU, not a case of armv4-mont.pl being
universally inferior.
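
To make the "improvement coefficient" concrete, here is a minimal Python sketch that computes it from the operations-per-second figures quoted above. The configuration labels and the function names are illustrative assumptions, not anything that exists in OpenSSL; only the numbers come from the Cortex-A8 results in this message. A ratio below 1.0 would correspond to being past the breaking point.

```python
# Operations per second copied from the Cortex-A8 speed results above.
# The configuration labels are illustrative, not OpenSSL names.
RESULTS = {
    "generic C": {
        ("rsa", "sign"): 19.0, ("rsa", "verify"): 703.5,
        ("dsa", "sign"): 68.6, ("dsa", "verify"): 57.1,
    },
    "armv4-mont": {
        ("rsa", "sign"): 25.5, ("rsa", "verify"): 877.3,
        ("dsa", "sign"): 86.0, ("dsa", "verify"): 71.9,
    },
    "armv4-mont + NEON": {
        ("rsa", "sign"): 47.5, ("rsa", "verify"): 1650.2,
        ("dsa", "sign"): 164.4, ("dsa", "verify"): 143.2,
    },
}


def improvement(baseline, candidate):
    """Return candidate/baseline ops-per-second ratios, keyed by operation."""
    return {op: candidate[op] / baseline[op] for op in baseline}


def past_breaking_point(baseline, candidate):
    """True if any operation is slower with the candidate than the baseline."""
    return any(r < 1.0 for r in improvement(baseline, candidate).values())


if __name__ == "__main__":
    base = RESULTS["generic C"]
    for name in ("armv4-mont", "armv4-mont + NEON"):
        for (alg, op), ratio in improvement(base, RESULTS[name]).items():
            print(f"{name:18s} {alg} {op:6s} {ratio:.2f}x")
        print(f"{name:18s} past breaking point: "
              f"{past_breaking_point(base, RESULTS[name])}")
```

Running the same comparison on figures collected for several key lengths on one CPU would show whether the ratio declines as the key grows, which is the trend described above.
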
It does come as a bit of a surprise, in the sense that I wouldn't expect
it to hit the point at 2048-bit key length on any specific ARM processor,
but on the other hand it's not impossible (all it takes is a multiplication
instruction stalling the pipeline for long enough to tip the balance).

> Anyway, it's uncertain to me whether armv4-mont.pl should remain.

Assuming that the majority of ARM users are not ARM9 users, most would have
to disagree :-) So where does it leave us? One can argue that OpenSSL could
detect the breaking point at run-time and act accordingly, but it's tricky
and is likely to have too narrow a use. One can argue that OpenSSL can be
further optimized so that the breaking point is moved further out (if not
eliminated), which is more practical, because it should improve performance
on all processors, but this is not something that happens overnight.
Meanwhile, just documenting the case and providing instructions on how to
disengage the module is probably a reasonable compromise. Would you agree?
One can make arrangements so that said instructions would be super-simple...

> FYI, I couldn't discern any difference whether using armv4-gf2m or not, but
> that doesn't mean it's bad.

armv4-gf2m is involved in Elliptic Curve, and of a specific kind. From your
problem description it doesn't sound like it should affect you. But even if
it did, it's unlikely that you'd notice a regression, because there are no
breaking points in that case.

_______________________________________________
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev