> The ARM assembly code implements some probing for the CPU at runtime
> and uses hard coded byte sequences for the probe code.
>
> This way does not work for arm object file formats using new big endian
> (BE8) mode. It is, however, very simple to fix:
>
> The armv4cpuid.S code (generated from armv4cpuid.PL) looks like this:
>
> _armv8_aes_probe:
>         .byte 0x00,0x03,0xb0,0xf3 @ aese.8 q0,q0
>         bx lr
>
> but it should simply be:
>
> _armv8_aes_probe:
>         .inst 0xf3b00300 @ aese.8 q0,q0
>         bx lr

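For reference, the relationship between the two quoted encodings is easy to check: the .byte sequence is simply the 32-bit word from the .inst line laid out least significant byte first. The Python below is only an illustrative sketch, not part of armv4cpuid; the names AESE_Q0_Q0 and as_bytes are made up for the example.

```python
import struct

# Instruction word taken from the quoted ".inst 0xf3b00300" line.
AESE_Q0_Q0 = 0xF3B00300


def as_bytes(word: int, little_endian: bool) -> bytes:
    """Serialize a 32-bit instruction word in the requested byte order."""
    return struct.pack("<I" if little_endian else ">I", word)


le = as_bytes(AESE_Q0_Q0, little_endian=True)
be = as_bytes(AESE_Q0_Q0, little_endian=False)

# The little-endian layout matches the quoted .byte line byte for byte.
print("little-endian:", ", ".join(f"0x{b:02x}" for b in le))
# The big-endian layout is what a BE32-style instruction stream would hold.
print("big-endian:   ", ", ".join(f"0x{b:02x}" for b in be))
```

So the hard-coded bytes are correct wherever instructions are fetched in little-endian order, which is the case the rest of this message is about.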
The original rationale behind this choice is as follows. Unfortunately, support for .inst was added relatively recently and, as people are not always in a position to update their assembler, using .inst would break compilation. So what to do under the circumstances? ARMv7 (and later) always fetches instructions in little-endian byte order(*), even when it's operating in big-endian mode. If something ensures that the instruction in question is never reached on a platform other than ARMv7 or later, then .byte is the correct choice in *all* situations. Is there such a provision? Yes: the probe for crypto extensions is guarded by the probe for ARMv7 NEON, and the NEON probe instruction is assembled by its mnemonic. In other words, if the NEON probe passes, then we know that the above .byte encoding is going to be the correct one.

> The .inst pseudo op outputs 32bit as instruction (with whatever byte swapping
> or $a markup is required for the object format in use) - hint from Matt
> Thomas.
>
> Fixing this may be enough to remove the "can not build universal binary for
> big endian arm" configure failure, with maybe a few ifdefs for BE8 vs. BE32
> big endian formats.

A universal binary means that you can take literally the same binary and execute it on several processors of the same endianness. We're talking about big-endian for the moment. But recall that ARMv7 fetches instructions in little-endian order even when it's operating in big-endian mode(*). That means that if a big-endian universal binary were an option, the code would have to be converted on the fly as it gets loaded into memory for execution. Is there a system that does that? ... As for BE8: as far as I understand, it allows you to change instruction endianness at the link stage, i.e. by the time you have already decided whether the code is going to be executed on ARMv7 or pre-ARMv7. Which doesn't give you a big-endian universal binary :-(

> While here I notice that the generated .S files always encode "RET" as
> "bx lr", even on armv4 machines.

This is done only in ARMv>=7 code paths. I mean, if the concern is that an ARMv4 processor is not capable of executing bx lr, then recall that it won't be able to execute a whole bunch of the instructions preceding it either. But it never attempts to execute them, because there is a run-time switch controlled by the processor capability bit-mask.

(*) There is an exception, namely the ARMv7-R profile, but the rationale is that NEON is unlikely to be an option there, so the NEON probe will fail and the AES probe won't be executed.
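
The guarded probe order and the run-time switch described in this message can be sketched roughly as follows. This is a simplified model in Python, not the actual OpenSSL code: the real probes execute an instruction and catch the resulting trap, whereas here they are modeled as plain callables returning True or False. The names CAP_NEON, CAP_AES, detect_caps and pick_aes_impl, as well as the bit values, are hypothetical.

```python
from typing import Callable

# Hypothetical capability bits, for illustration only.
CAP_NEON = 1 << 0
CAP_AES = 1 << 1


def detect_caps(neon_probe: Callable[[], bool],
                aes_probe: Callable[[], bool]) -> int:
    """Build a capability bit-mask, running the AES probe only behind NEON."""
    caps = 0
    if neon_probe():
        caps |= CAP_NEON
        # Reached only when the NEON probe passed, i.e. on ARMv7 or later,
        # so a byte-encoded probe instruction is safe to execute here.
        if aes_probe():
            caps |= CAP_AES
    return caps


def pick_aes_impl(caps: int) -> str:
    """Run-time switch: choose a code path from the capability bit-mask."""
    return "armv8-aes" if caps & CAP_AES else "generic"
```

On a processor where the NEON probe fails, aes_probe is never called, and pick_aes_impl falls back to the generic path, so neither the byte-encoded probe nor the ARMv>=7 code paths are ever executed there.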
