Re: [openssl-dev] Making assembly language optimizations working on Cortex-M3

Andy Polyakov Wed, 25 May 2016 15:25:52 -0700

> 
>     
> http://git.openssl.org/gitweb/?p=openssl.git;a=commitdiff;h=11208dcfb9105e8afa37233185decefd45e89e17
>     made whole assembly pack Thumb2-friendly, so that now you should be able
>     to compile all modules. Please, double-check.
> 
> 
> This is awesome!
> 
> I have a question about the `it` and `itt` instructions you inserted.
> You wrapped them in `#ifdef __thumb2__`, which is not wrong, but AFAICT
> is usually unnecessary. Is this to support some old assemblers that
> don't compile `it` (etc.) into nothing for non-Thumb builds?


Yes. Note that #ifdefs are normally omitted in NEON code paths, because
assemblers capable of assembling NEON code are assumed to handle even
'it' without __thumb2__.

>     There is no option to
>     disable NEON (yet?), because a) I want to expose it to more build cases
>     to catch eventual bugs; b) I would like to suggest the idea of supporting
>     Cortex-M with -march=armv6t2 -mthumb. The latter means that you'll lose
>     some performance, because it won't utilize the word load instruction's
>     capability to handle misaligned access in ARMv7. But on the other hand
>     it won't have ideas about compiling NEON, and you'll be excused from
>     thinking about which particular Cortex-M is targeted; one will be able
>     to cover all with a single config/build. Can it be a viable compromise?
>     One would still be able to tune for a favorite Mx...
> 
> 
> For Cortex-M4 and friends, one would really want to use the
> full ARMv7-M instruction set (i.e. not compile for armv6t2). In general
> Cortex-M platforms are so limited that every bit of performance and
> space savings matters. So, I think it is definitely worthwhile to
> support the non-NEON ARMv7-M configuration. One easy way to do this
> would be to avoid building NEON code when __TARGET_PROFILE_M is defined.

I don't see any __TARGET_PROFILE_M defined by gcc... Or do you mean that
*we* can define it in arm_arch.h? Or maybe you are talking about ARM's
compiler... Anyway, care to make a suggestion in the form of a patch, one
that would be suitable even for gcc? [Just in case, no, I don't have ARM's
compiler, only its manual.]
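
As a rough illustration of what such a gcc-friendly check might look like
(a sketch only, not an actual patch): gcc does not predefine
__TARGET_PROFILE_M, but it does predefine per-architecture macros such as
__ARM_ARCH_7M__ and __ARM_ARCH_7EM__, and ACLE-conforming compilers define
__ARM_ARCH_PROFILE. The EXAMPLE_* names and the two helper functions below
are hypothetical and do not exist in arm_arch.h.

```c
/*
 * Hypothetical sketch: derive an "M profile" flag from macros that
 * different compilers predefine. All EXAMPLE_* names are invented here.
 */
#if defined(__TARGET_PROFILE_M) || \
    defined(__ARM_ARCH_6M__) || defined(__ARM_ARCH_7M__) || \
    defined(__ARM_ARCH_7EM__) || \
    (defined(__ARM_ARCH_PROFILE) && __ARM_ARCH_PROFILE == 'M')
# define EXAMPLE_ARM_PROFILE_M 1
#else
# define EXAMPLE_ARM_PROFILE_M 0
#endif

/* Cortex-M cores have no NEON unit, so NEON paths would be skipped there. */
#if EXAMPLE_ARM_PROFILE_M
# define EXAMPLE_NEON_ALLOWED 0
#else
# define EXAMPLE_NEON_ALLOWED 1
#endif

/* Small accessors so the result of the preprocessor logic can be inspected. */
int example_is_profile_m(void)
{
    return EXAMPLE_ARM_PROFILE_M;
}

int example_neon_allowed(void)
{
    return EXAMPLE_NEON_ALLOWED;
}
```

In an assembly module the same flag would presumably guard the NEON
sections with #if, since the modules are run through the C preprocessor.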

> Alternatively, similar to what BoringSSL did, you could have an option
> that says "instead of doing runtime feature detection, instead detect
> features at compile time based on __ARM_NEON__ and the like." I think
> such a configuration would also help the C compiler do whole-program
> optimization better.

I doubt that, because the compiler doesn't look at assembly modules.
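
For reference, the compile-time alternative from the quoted proposal could
look roughly like this on the C side. This is a minimal sketch under stated
assumptions: EXAMPLE_STATIC_ARMCAP, EXAMPLE_CAP_NEON and the functions are
hypothetical names, not OpenSSL or BoringSSL API. It only changes how C
code picks a path; the assembly modules themselves stay opaque to the
compiler either way.

```c
#include <stdint.h>

/* Hypothetical capability bit; the name and value are for illustration. */
#define EXAMPLE_CAP_NEON (1u << 0)

/* In a runtime-detection build this would be filled by a startup probe. */
static uint32_t example_runtime_caps;

void example_set_runtime_caps(uint32_t caps)
{
    example_runtime_caps = caps;
}

/*
 * With EXAMPLE_STATIC_ARMCAP defined (hypothetical build option), the
 * capability mask is a compile-time constant derived from compiler
 * predefines, so the NEON test below can be folded away by the compiler.
 */
static uint32_t example_armcaps(void)
{
#if defined(EXAMPLE_STATIC_ARMCAP)
# if defined(__ARM_NEON__) || defined(__ARM_NEON)
    return EXAMPLE_CAP_NEON;
# else
    return 0u;
# endif
#else
    return example_runtime_caps;
#endif
}

/* Returns 1 if NEON code paths may be used, 0 otherwise. */
int example_have_neon(void)
{
    return (example_armcaps() & EXAMPLE_CAP_NEON) != 0;
}
```

Without EXAMPLE_STATIC_ARMCAP, example_have_neon() simply reflects whatever
the runtime probe stored via example_set_runtime_caps().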

-- 
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev
