> My mistake, it looks like my memory was wrong on two accounts. First,
> it was AES, not SHA, where I observed the no-asm was faster. Second, it
> was on the PowerPC cross-compiled target, not ARM. The results from
> "openssl speed aes-128-cbc" are:
>
> type          16 bytes    64 bytes    256 bytes   1024 bytes  8192 bytes
> w/o no-asm    31010.47k   32988.82k   33549.41k   33693.05k   33825.67k
> no-asm        42431.46k   46485.14k   47479.20k   47874.86k   47829.36k
>
> This is using a Freescale 8548.

This is no mystery at all, and it is kind of intentional. If you examine the commentary in aes-ppc.pl, you'll notice that it relies on "compact" subroutines, those using 256-byte S-boxes, which require more computation. It mentions that "compact" encrypt is ~2 times slower than "traditional" encrypt. On the other side of the scales is the insecurity of the "traditional" subroutine, which is susceptible to cache-timing attacks. Well, it's not like "compact" is not susceptible, but it's *much* more resistant. Indeed, vulnerability is quantified by the probability of a cache line not being accessed as a result of a block operation, and in the "compact" case it is as low as (1-32/256)^160=5e-10 vs. (1-4/256)^160=0.08 for the processor in question. Note that the C version is even worse than the "non-compact" assembly subroutine.

You might argue that there is no room for an adversary in *your* application and that performance should be favoured. By "no room" I mean that it's probably a locked-down embedded system, where an adversary able to execute their own code would already be considered a big enough problem. Yes, but you have to *argue* in favour. Maybe it should be a compile option...
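For anyone who wants to check the two probabilities quoted above, here is a minimal sketch in Python. It assumes that 32/256 and 4/256 are the fractions of the table covered by a single cache line, that a block operation performs 160 independent, uniformly distributed lookups, and that the table has 256 entries. The function name is hypothetical and not taken from OpenSSL.

```python
def untouched_line_probability(entries_per_line: int,
                               table_entries: int = 256,
                               lookups: int = 160) -> float:
    """Probability that one given cache line is never touched
    during a block operation.

    Assumption: every lookup is independent and uniformly
    distributed over the table, so each lookup misses the line
    with probability 1 - entries_per_line / table_entries.
    """
    miss_once = 1 - entries_per_line / table_entries
    return miss_once ** lookups


# Figures taken from the message: 32 entries per line for the
# "compact" S-box, 4 entries per line for the "traditional" table.
compact = untouched_line_probability(32)
traditional = untouched_line_probability(4)

print(f"compact:     {compact:.1e}")
print(f"traditional: {traditional:.2f}")
```

The smaller the probability, the less an observer learns from which cache lines stayed untouched, which is the sense in which "compact" is more resistant.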