That may not be a good idea. The vast majority of OpenSSL in use isn't targeted at a specific processor variant. It's compiled by an OS vendor and then installed on whatever. If you are in the situation where you are compiling for a space-constrained embedded processor, then hopefully your engineers also have enough smarts to fix the code. I'd also point out that a lot of dev setups for embedded aren't actually compiled on the target machine either, so auto-detection at build time isn't that sensible anyway.
The problem here is you can't have both and having the capability switch at runtime depending on hardware quirks is the better option for the majority of users. You certainly don't want to mess with the runtime OPENSSL_armcap_P as that likely breaks 'the rest of the world' (tm)

Peter

From: Brian Smith <br...@briansmith.org>
To: openssl-dev@openssl.org
Date: 08/06/2016 11:49
Subject: Re: [openssl-dev] Making assembly language optimizations working on Cortex-M3
Sent by: "openssl-dev" <openssl-dev-boun...@openssl.org>

Andy Polyakov <ap...@openssl.org> wrote:
>> > Cortex-M platforms are so limited that every bit of performance and
>> > space savings matters. So, I think it is definitely worthwhile to
>> > support the non-NEON ARMv7-M configuration. One easy way to do this
>> > would be to avoid building NEON code when __TARGET_PROFILE_M is defined.
>>
>> I don't see no __TARGET_PROFILE_M defined by gcc
>>
>>
>> I see. I didn't realize that GCC didn't emulate this ARM compiler
>> feature. Never mind.
>
> But gcc defines __ARM_ARCH_7M__, which can be used to e.g.

Thanks. That's useful to know.

>> I can try to make a patch to bring BoringSSL's OPENSSL_STATIC_ARMCAP
>> mechanism to OpenSSL, if you think that is an OK approach.
>
> I don't understand. Original question was about conditional *omission*
> of NEON code (which incidentally means even omission of run-time
> switches), while BoringSSL's OPENSSL_STATIC_ARMCAP is about *keeping*
> NEON as well as run-time switch *code*, just setting OPENSSL_armcap_P to
> a chosen value at compile time... I mean it looks like we somehow
> started to talk about different things... When I wrote "care to make
> suggestion" I was thinking about going through all #if __ARM_ARCH__>=7
> and complementing some of them with !defined(something_M)...
> Compiler might remove dead code it would generate itself, but it still
> won't omit anything from assembly module. Linker takes them in as
> monolithic blocks.
If the target is Cortex-M4, there is no NEON. So then, with the OPENSSL_STATIC_ARMCAP mechanism, we won't define OPENSSL_STATIC_ARMCAP_NEON and so that bit of the armcap variable won't be set. I think what you're trying to say is that, if we just stop there, then all the NEON code will still get linked in. That's true.

But what I mean is that we should then also change all the tests of the NEON bit of OPENSSL_armcap_P (and, more generally, all tests of OPENSSL_armcap_P) to use code that the C compiler can do constant propagation and dead code elimination on. We can do this, for example, by defining `OPENSSL_armcap_P` to be a macro that can be seen to have a constant compile-time value when using the OPENSSL_STATIC_ARMCAP mechanism. And/or, we can surround the relevant code with `#if !defined(OPENSSL_STATIC_ARMCAP) || defined(OPENSSL_STATIC_ARMCAP_NEON)`, etc. This latter technique would (IIUC) work even in the assembly language files.

In this way, if we know at build time that NEON will be available, we can avoid compiling/linking the non-NEON code. Conversely, if we know that NEON will NOT be available, we can avoid compiling/linking the NEON code.

I hope this clarifies my suggestion.

Cheers,
Brian
--
https://briansmith.org/

--
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev
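
For illustration, the two techniques Brian describes above (a capability word that becomes a compile-time constant, plus `#if` guards around each code path) might be sketched in C roughly as follows. This is only a sketch under stated assumptions, not OpenSSL or BoringSSL code: OPENSSL_STATIC_ARMCAP and OPENSSL_STATIC_ARMCAP_NEON are the macro names from the thread, while every `SKETCH_`/`sketch_` name, the bit value, and the two stand-in implementations are hypothetical.

```c
/* Hypothetical NEON capability bit; not the value from OpenSSL's headers. */
#define SKETCH_ARMV7_NEON (1u << 0)

#if defined(OPENSSL_STATIC_ARMCAP)
/* Static build: the capability word is a constant the compiler can fold. */
# if defined(OPENSSL_STATIC_ARMCAP_NEON)
#  define SKETCH_armcap_P (SKETCH_ARMV7_NEON)
# else
#  define SKETCH_armcap_P (0u)
# endif
#else
/* Default build: stand-in for a capability word filled in at runtime. */
static unsigned int sketch_armcap_runtime = 0;
# define SKETCH_armcap_P (sketch_armcap_runtime)
#endif

/* Decide at preprocessing time which code paths exist at all. */
#if !defined(OPENSSL_STATIC_ARMCAP) || defined(OPENSSL_STATIC_ARMCAP_NEON)
# define SKETCH_BUILD_NEON 1
#else
# define SKETCH_BUILD_NEON 0
#endif
#if !defined(OPENSSL_STATIC_ARMCAP) || !defined(OPENSSL_STATIC_ARMCAP_NEON)
# define SKETCH_BUILD_GENERIC 1
#else
# define SKETCH_BUILD_GENERIC 0
#endif

enum sketch_impl { SKETCH_IMPL_GENERIC = 0, SKETCH_IMPL_NEON = 1 };

#if SKETCH_BUILD_NEON
/* Stand-in for a NEON routine; omitted entirely when NEON is ruled out. */
static enum sketch_impl sketch_impl_neon(void) { return SKETCH_IMPL_NEON; }
#endif

#if SKETCH_BUILD_GENERIC
/* Stand-in for the non-NEON routine; omitted when NEON is guaranteed. */
static enum sketch_impl sketch_impl_generic(void) { return SKETCH_IMPL_GENERIC; }
#endif

/* Pick an implementation; only the default build keeps a runtime test. */
static enum sketch_impl sketch_select_impl(void)
{
#if SKETCH_BUILD_NEON && SKETCH_BUILD_GENERIC
    return (SKETCH_armcap_P & SKETCH_ARMV7_NEON) ? sketch_impl_neon()
                                                 : sketch_impl_generic();
#elif SKETCH_BUILD_NEON
    return sketch_impl_neon();
#else
    return sketch_impl_generic();
#endif
}
```

Built with no extra flags, both paths are compiled and the choice is made at runtime. Built with -DOPENSSL_STATIC_ARMCAP alone, the NEON stand-in is never compiled; adding -DOPENSSL_STATIC_ARMCAP_NEON drops the generic one instead. The same `#if` conditions could in principle wrap the bodies of preprocessed assembly files, which is what addresses Andy's point that the linker takes assembly modules in as monolithic blocks.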