>> Here is an updated version of the patch.
>>
>> Addressing a) "pointer to the function" (to select ADCX/ADOX) and b)
>> multiple points addition
>>
>> There is (only) ~1% performance deterioration due to the pointer being
>> passed now, instead of (originally) being static. You can choose which
>> style is preferable.
>>
>
> Thanks!
>
> Alternatives would be (a) using a new lock for safe static initialization,
> or (b) more code duplication to avoid the need for an explicit pointer
> (there could be two separate implementations for the higher-level
> routines). However, given the 1% performance penalty, that's a minor issue
> at this point.
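
To make the two styles concrete, here is a minimal sketch in C of (1) static function pointers that are lazily initialized, each one checked and assigned on its own, and (2) the same routine with the pointers passed explicitly. All names (mul_fn, sqr_fn, mul_adx, cpu_has_adx, mul_then_sqr and so on) are hypothetical and not taken from the patch, and cpu_has_adx() is a stub. Note that the unsynchronized stores in (1) are formally a data race under C11; the sketch only shows the shape of the pattern being argued about.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical routine types standing in for the ADCX/ADOX and generic paths. */
typedef uint64_t (*mul_fn)(uint64_t a, uint64_t b);
typedef uint64_t (*sqr_fn)(uint64_t a);

static uint64_t mul_generic(uint64_t a, uint64_t b) { return a * b; }
static uint64_t mul_adx(uint64_t a, uint64_t b)     { return a * b; }
static uint64_t sqr_generic(uint64_t a)             { return a * a; }
static uint64_t sqr_adx(uint64_t a)                 { return a * a; }

/* Stub (assumption): a real build would query the CPU capability bits. */
static int cpu_has_adx(void) { return 0; }

/* Style 1: static pointers, lazily assigned on first use. */
static mul_fn functiona = NULL;
static sqr_fn functionb = NULL;

uint64_t mul_then_sqr(uint64_t a, uint64_t b)
{
    /* Each pointer is tested and assigned independently, so this routine
     * never calls through a pointer it has not itself seen as non-NULL. */
    if (functiona == NULL)
        functiona = cpu_has_adx() ? mul_adx : mul_generic;
    if (functionb == NULL)
        functionb = cpu_has_adx() ? sqr_adx : sqr_generic;

    return functionb(functiona(a, b));
}

/* Style 2: no static state; the caller selects and passes the routines. */
uint64_t mul_then_sqr_ptr(mul_fn mul, sqr_fn sqr, uint64_t a, uint64_t b)
{
    return sqr(mul(a, b));
}
```

Style 2 needs no initialization logic at all, at the price of the extra argument on every call.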

While if (functiona==NULL || functionb==NULL) { assign functiona,
functionb } can be unsafe, I'd argue that if (functiona==NULL) { assign
functiona } followed by if (functionb==NULL) { assign functionb } is safe.

> Do you have any comment from Intel on the concerns regarding the scattering
> technique (http://cryptojedi.org/peter/data/chesrump-20130822.pdf)?

As discussed off-list, in this case the discrepancy arises because the
so-called memory disambiguation logic attempts to move loads ahead of
stores, and fails when the least significant bits of the addresses are the
same. Naturally, the load has to be given an "opportunity" to *try* to get
ahead. I mean, if there is enough "work done" between the store and the
potentially conflicting load, then the load won't "try" to get ahead of the
store and the variation goes away. And indeed, if you add instructions to
the test program, the variation disappears. On Intel CPUs an amount of work
"worth" 5 cycles appears to be sufficient.

For the record, an update for the x86_64-mont modules is being prepared...

______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
Development Mailing List                       openssl-dev@openssl.org
Automated List Manager                           majord...@openssl.org