Andy Polyakov <ap...@openssl.org> wrote:
>> > Cortex-M platforms are so limited that every bit of performance and
>> > space savings matters. So, I think it is definitely worthwhile to
>> > support the non-NEON ARMv7-M configuration. One easy way to do this
>> > would be to avoid building NEON code when __TARGET_PROFILE_M is
>> > defined.
>>
>> I don't see no __TARGET_PROFILE_M defined by gcc
>>
>>
>> I see. I didn't realize that GCC didn't emulate this ARM compiler
>> feature. Never mind.
>
> But gcc defines __ARM_ARCH_7M__, which can be used to e.g.

Thanks. That's useful to know.

>> I can try to make a patch to bring BoringSSL's OPENSSL_STATIC_ARMCAP
>> mechanism to OpenSSL, if you think that is an OK approach.
>
> I don't understand. Original question was about conditional *omission*
> of NEON code (which incidentally means even omission of run-time
> switches), while BoringSSL's OPENSSL_STATIC_ARMCAP is about *keeping*
> NEON as well as run-time switch *code*, just setting OPENSSL_armcap_P to
> a chosen value at compile time... I mean it looks like we somehow
> started to talk about different things... When I wrote "care to make
> suggestion" I was thinking about going through all #if __ARM_ARCH__>=7
> and complementing some of them with !defined(something_M)...
> Compiler might remove dead code it would generate itself, but it still
> won't omit anything from assembly module. Linker takes them in as
> monolithic blocks.

If the target is Cortex-M4, there is no NEON. So, with the
OPENSSL_STATIC_ARMCAP mechanism, we won't define
OPENSSL_STATIC_ARMCAP_NEON, and so that bit of the armcap variable won't
be set. I think what you're trying to say is that, if we just stop
there, then all the NEON code will still get linked in. That's true.

But what I mean is that we should then also change all the tests of the
NEON bit of OPENSSL_armcap_P (and, more generally, all tests of
OPENSSL_armcap_P) to use code that the C compiler can do constant
propagation and dead code elimination on. We can do this, for example,
by defining `OPENSSL_armcap_P` to be a macro that can be seen to have a
constant compile-time value when using the OPENSSL_STATIC_ARMCAP
mechanism. And/or we can surround the relevant code with
`#if !defined(OPENSSL_STATIC_ARMCAP) || defined(OPENSSL_STATIC_ARMCAP_NEON)`,
etc. This latter technique would (IIUC) work even in the assembly
language files.

In this way, if we know at build time that NEON will be available, we
can avoid compiling/linking the non-NEON code.
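
To make the macro and `#if` techniques above concrete, here is a minimal
sketch in C. It is not BoringSSL's or OpenSSL's actual code. It simulates
a Cortex-M4 style build by defining OPENSSL_STATIC_ARMCAP in the file and
leaving OPENSSL_STATIC_ARMCAP_NEON undefined. The bit value given to
`ARMV7_NEON` and the names `enum impl`, `impl_neon`, `impl_generic` and
`select_impl` are assumptions made for illustration only.

```c
#include <assert.h> /* static_assert (C11) */
#include <stdint.h>

/* Assumption for this sketch: a static, non-NEON configuration. A real
 * build would pass these with -D instead of defining them in the source. */
#define OPENSSL_STATIC_ARMCAP
/* OPENSSL_STATIC_ARMCAP_NEON is deliberately left undefined. */

/* Bit value assumed for illustration. */
#define ARMV7_NEON (1u << 0)

#if defined(OPENSSL_STATIC_ARMCAP)
/* The capability word is a compile-time constant, so the compiler can
 * fold every test of it and drop the dead branch. */
# if defined(OPENSSL_STATIC_ARMCAP_NEON)
#  define OPENSSL_armcap_P ARMV7_NEON
# else
#  define OPENSSL_armcap_P 0u
# endif
#else
/* Without the static mechanism it stays an ordinary variable that
 * run-time detection would fill in (kept file-local in this sketch). */
static uint32_t OPENSSL_armcap_P = 0;
#endif

/* Hypothetical identifiers standing in for real routines. */
enum impl { IMPL_GENERIC = 0, IMPL_NEON = 1 };

#if !defined(OPENSSL_STATIC_ARMCAP) || defined(OPENSSL_STATIC_ARMCAP_NEON)
/* Compiled only when NEON might be present on the target. */
static enum impl impl_neon(void) { return IMPL_NEON; }
#else
/* The NEON bit is known to be clear, and this is checked at build time. */
static_assert((OPENSSL_armcap_P & ARMV7_NEON) == 0,
              "NEON bit must be clear in a static non-NEON build");
#endif

static enum impl impl_generic(void) { return IMPL_GENERIC; }

/* Hypothetical dispatcher: the guard removes the NEON call site, so no
 * reference to impl_neon() survives in a static non-NEON build. */
enum impl select_impl(void) {
#if !defined(OPENSSL_STATIC_ARMCAP) || defined(OPENSSL_STATIC_ARMCAP_NEON)
    if (OPENSSL_armcap_P & ARMV7_NEON) {
        return impl_neon();
    }
#endif
    return impl_generic();
}
```

With this configuration `select_impl()` always takes the generic path, and
`impl_neon()` is never compiled. The same `#if` expression could guard a
block in an assembly file that goes through the C preprocessor, which is
the case where the compiler's own dead code elimination does not help.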

Conversely, if we know that NEON will NOT be available, we can avoid
compiling/linking the NEON code.

I hope this clarifies my suggestion.

Cheers,
Brian
--
https://briansmith.org/