Hi Andy,

I would go with option 2: detect the ARM architecture we are compiling
for in the configure phase, and compile for it.

My gut feeling is that it is better not to have extra stuff in the
binary. When you compile for a specific architecture, you get a tighter,
simpler binary with no unused execution paths. If the wrong ARM
architecture code path were somehow executed, undefined behaviour could
result: SIGILL crashes, for example.

The compiler can also optimize the C parts for the architecture. I
remember accidentally running a color conversion library, written in C
and compiled for Cortex-A15, on an A9, and it crashed with SIGILL. So the
compiler output really does differ between the two. When I disassembled
it, I could see it was using some new instruction combinations compared
with the A9 binary I then made.

Anyhow, when compiling for specific architectures you are likely to see
some performance increase in the C parts that are not written in
assembly or NEON-optimized.

All the best,
William

On Fri, Oct 24, 2014 at 6:02 PM, Andy Polyakov <[email protected]> wrote:
> There is inconsistency in ARM support and I'd like to gather some
> opinions on how to resolve it. Circulate this to ARM people near you.
>
> At some point an inconsistency of following nature was introduced and
> then just grew. OpenSSL attempts to adapt to processor it's running on
> by detecting capabilities and using run-time switch between different
> code paths. The code probing capabilities is compiled and executed on
> all supported ARM architectures, ARMv4 through ARMv8. Original rationale
> was that one should be able to produce "universal" binary that can be
> executed on wide range of processors and deliver optimal performance on
> all of them. But at the same time assembly modules have #if
> __ARM_ARCH__>=X which effectively renders them not as universal as
> implied in capability probing code. This is the inconsistency. There are
> two ways to resolve it.
>
> 1. __ARM_ARCH__ is effectively controlled by compiler -march command
> line option, which naturally also controls compiler-generated outcome.
> In order to live up to original intention to produce "universal" binary,
> it would be appropriate to tell *compiler* to generate code for minimal
> architecture one wants to target, but tell *assembler* to accept
> instructions for maximum architecture one wants to target, e.g.
> -march=armv4 -Wa,-march=armv7-a.
>
> 2. Abandon the idea of producing true "universal" binary and limit
> capability detection to contemporary processor families, ARMv7/8 for the
> moment of this writing. And run pre-ARMv7 without capability detection
> (there is nothing to detect really).
>
> I suppose distro vendors would prefer 2nd option, because everything has
> to match anyway. ISV on the other hand might prefer 1st one, unless of
> course they target specific distros one by one rather than providing
> unified binaries that can be executed on multiple distros.
> ______________________________________________________________________
> OpenSSL Project http://www.openssl.org
> Development Mailing List [email protected]
> Automated List Manager [email protected]
