> > Cortex-M platforms are so limited that every bit of performance and
> > space savings matters. So, I think it is definitely worthwhile to
> > support the non-NEON ARMv7-M configuration. One easy way to do this
> > would be to avoid building NEON code when __TARGET_PROFILE_M is defined.
> >
> > I don't see no __TARGET_PROFILE_M defined by gcc
> >
> I see. I didn't realize that GCC didn't emulate this ARM compiler
> feature. Never mind.

But gcc defines __ARM_ARCH_7M__, which can be used, e.g.:

#if !defined(__TARGET_PROFILE_M) && defined(__ARM_ARCH_7M__)
# define __TARGET_PROFILE_M
#endif

> Anyway, care to make a suggestion in form of patch? That
> would be suitable even for gcc? [Just in case, no, I don't have ARM's
> compiler, only its manual.]
>
>
> I can try to make a patch to bring BoringSSL's OPENSSL_STATIC_ARMCAP
> mechanism to OpenSSL, if you think that is an OK approach.

I don't understand. The original question was about conditional
*omission* of NEON code (which incidentally means omitting the run-time
switches as well), while BoringSSL's OPENSSL_STATIC_ARMCAP is about
*keeping* the NEON code as well as the run-time switch *code*, and just
setting OPENSSL_armcap_P to a chosen value at compile time... It looks
like we somehow started talking about different things... When I wrote
"care to make a suggestion" I was thinking about going through all
#if __ARM_ARCH__>=7 and complementing some of them with
!defined(something_M)...

> > Alternatively, similar to what BoringSSL did, you could have an option
> > that says "instead of doing runtime feature detection, instead detect
> > features at compile time based on __ARM_NEON__ and the like." I think
> > such a configuration would also help the C compiler do whole-program
> > optimization better.
> >
> > I doubt that, because compiler doesn't look at assembly modules.
> >
> For example, in the AES-GCM code, there is a runtime check to decide
> between various implementations. With the OPENSSL_STATIC_ARMCAP-like
> approach, in theory the compiler's constant propagation and dead code
> elimination can work together to automatically optimize away the code
> paths that aren't applicable to the current configuration, without
> needing to maintain lots of #ifdefs.

The compiler might remove dead code it would generate itself, but it
still won't omit anything from an assembly module. The linker takes
assembly modules in as monolithic blocks.
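
To make the omission approach concrete, here is a minimal sketch in C. It is
an illustration only, not OpenSSL or BoringSSL code: every EXAMPLE_/example_
name is hypothetical, example_armcap merely stands in for OPENSSL_armcap_P,
and the enum values stand in for the real generic and NEON routines.

```c
/* Sketch only: all EXAMPLE_/example_ names are hypothetical. */

/* GCC does not define __TARGET_PROFILE_M, but per the discussion above it
 * defines __ARM_ARCH_7M__ for ARMv7-M, so test both spellings. */
#if defined(__TARGET_PROFILE_M) || defined(__ARM_ARCH_7M__)
# define EXAMPLE_BUILD_NEON 0   /* M profile: leave NEON code out entirely */
#else
# define EXAMPLE_BUILD_NEON 1   /* other targets: keep NEON code and switch */
#endif

#define EXAMPLE_CAP_NEON (1u << 0)   /* hypothetical capability bit */

/* Stand-in for OPENSSL_armcap_P; normally filled in by run-time probing. */
unsigned int example_armcap = 0;

enum example_impl { EXAMPLE_IMPL_GENERIC = 0, EXAMPLE_IMPL_NEON = 1 };

/* Pick an implementation. The run-time switch below is compiled only
 * when the NEON code itself is built. */
enum example_impl example_select_impl(void)
{
#if EXAMPLE_BUILD_NEON
    if (example_armcap & EXAMPLE_CAP_NEON)
        return EXAMPLE_IMPL_NEON;
#endif
    return EXAMPLE_IMPL_GENERIC;
}
```

With EXAMPLE_BUILD_NEON set to 0, neither the NEON routine nor the switch
exists in the build. A static-armcap style build would instead keep both and
only pin example_armcap to a constant; the C branch might then be folded
away by the compiler, but an assembly module holding the NEON routine would
still be linked in whole.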

--
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev