On Fri, 23 Jan 2015, Niels Möller wrote:
> [email protected] (Niels Möller) writes:
>> I haven't done the memory barrier thing yet, it appears to be more
>> complicated than I had hoped. The manual I have says that the dmb
>> instruction (data memory barrier) is available only with ARMv7 and
>> later, and that ARMv6 uses writes to CP15 registers (I haven't yet tried
>> to figure out what that means, or whether this method also works on
>> later versions).
> I think I've found a simple solution. I deleted the initialized flag in
> fat_init; instead, I let each caller read the particular function pointer
> it is interested in and check whether it is already properly initialized.
> I.e., check if the current value equals its static initializer, and
> if so, call fat_init.
> This way, store order consistency between threads no longer matters, and
> we won't need any memory barriers.
> I'd like to merge this code on the master branch soon. It would be nice
> if anyone else could give it a little testing, in particular on various
> ARM devices. I've tested it on a few different x86_64 PCs and an ARMv7
> Pandaboard, all running GNU/Linux.
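For readers following along, the pointer check described above could look
roughly like the sketch below. All names except fat_init are hypothetical
(add_vec, add_init, cpu_has_fast_path and friends are made up for
illustration), and this is not the actual nettle code. It also assumes that
aligned pointer loads and stores are not torn, and that fat_init always
stores the same values no matter which thread runs it.

```c
#include <assert.h>

typedef int (*add_func) (int, int);

/* Two interchangeable implementations; add_fast stands in for an
   assembly routine that is only usable on some CPUs. */
static int add_generic (int a, int b) { return a + b; }
static int add_fast (int a, int b) { return a + b; }

static int add_init (int a, int b);

/* The static initializer doubles as the "not yet set up" marker,
   so no separate initialized flag is needed. */
static add_func add_vec = add_init;

/* Demonstration counter only; not thread-safe. */
static int fat_init_calls = 0;

/* Hypothetical stand-in for real CPU feature detection. */
static int cpu_has_fast_path (void) { return 1; }

/* Idempotent: every call stores the same pointers, so two threads
   racing through here end up with the same result. */
static void
fat_init (void)
{
  fat_init_calls++;
  add_vec = cpu_has_fast_path () ? add_fast : add_generic;
}

/* Reached only while add_vec still holds its static initializer.
   It looks at just the one pointer it cares about. */
static int
add_init (int a, int b)
{
  if (add_vec == add_init)
    fat_init ();
  assert (add_vec != add_init);
  return add_vec (a, b);
}

/* Public entry point: always an indirect call through the pointer. */
static int
add (int a, int b)
{
  return add_vec (a, b);
}
```

The first call to add() goes through add_init, which runs fat_init and then
forwards to the selected routine; later calls jump straight to it. A thread
that still sees the old pointer value just runs fat_init once more, which is
why the ordering of the stores stops mattering.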
I tested it on a Raspberry Pi (ARMv6), and it seems to work pretty much as
intended - I was able to do a fat build with neon included, and the
testsuite still runs fine on it (so the detection seems to work as intended).
I also tested building for ARMv5 using the Android NDK, and I noted that
arm/v6/aes*.asm require a ".arch armv6" at the start, otherwise they fail
to assemble in that configuration. (The neon sources seem to have ".fpu
neon" similarly already. I'm not sure if some of the neon sources perhaps
would require an ".arch armv7-a" as well, but they did seem to build just
fine in my test so perhaps it isn't necessary.)
To test this for yourself in case you're interested, add
<ndk>/toolchains/arm-linux-androideabi-4.6/prebuilt/*x86*/bin to your
path and configure like this:
  SYSROOT=<ndk>/platforms/android-3/arch-arm/
  CC="arm-linux-androideabi-gcc --sysroot=$SYSROOT" \
  CXX="arm-linux-androideabi-g++ --sysroot=$SYSROOT" \
  ./configure --host=arm-linux-gnueabi --enable-fat
Other than that, building with --enable-fat does seem to do the right
thing - much better than the current setup. (E.g. currently, if
cross-compiling for Raspberry Pi, it fails to enable the v6 routines,
since the host triplet is arm-bcm2708hardfp-linux-gnueabi even though it's
an ARMv6 device. When building on such a device, config.guess gives
armv6l-unknown-linux-gnueabihf instead.)
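To illustrate why a run-time check avoids the triplet problem, here is a
rough sketch of one way such detection could be done on Linux, by parsing
text in the style of /proc/cpuinfo. This is an assumption on my part about
a possible approach, not a description of what the fat code actually does;
struct arm_features, cpuinfo_field, has_word and parse_cpuinfo are all
hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct arm_features
{
  int arch_version;   /* e.g. 6 or 7; 0 if not found */
  int have_neon;
};

/* Return a pointer just past the ':' on the first line starting with
   KEY, or NULL if there is no such line. */
static const char *
cpuinfo_field (const char *text, const char *key)
{
  size_t klen = strlen (key);
  const char *line = text;
  while (*line)
    {
      size_t len = strcspn (line, "\n");
      if (len > klen && strncmp (line, key, klen) == 0)
        {
          const char *colon = memchr (line, ':', len);
          if (colon)
            return colon + 1;
        }
      line += len;
      if (*line == '\n')
        line++;
    }
  return NULL;
}

/* Check whether WORD occurs as a whole blank-separated token in the
   first LEN bytes of S. */
static int
has_word (const char *s, size_t len, const char *word)
{
  size_t wlen = strlen (word);
  size_t i = 0;
  while (i < len)
    {
      size_t start;
      while (i < len && (s[i] == ' ' || s[i] == '\t'))
        i++;
      start = i;
      while (i < len && s[i] != ' ' && s[i] != '\t')
        i++;
      if (i - start == wlen && memcmp (s + start, word, wlen) == 0)
        return 1;
    }
  return 0;
}

/* TEXT is the full cpuinfo contents as a NUL-terminated string. */
static struct arm_features
parse_cpuinfo (const char *text)
{
  struct arm_features f = { 0, 0 };
  const char *v;

  assert (text != NULL);

  v = cpuinfo_field (text, "CPU architecture");
  if (v)
    f.arch_version = atoi (v);   /* atoi skips the leading blanks */

  v = cpuinfo_field (text, "Features");
  if (v)
    f.have_neon = has_word (v, strcspn (v, "\n"), "neon");

  return f;
}
```

The result depends only on what the running system reports, so a
cross-compile with an uninformative host triplet would still pick the v6 or
neon routines at run time, provided they were assembled into the fat build.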
I take it you've tested building for Windows? Although the x86 detection
should be much simpler, so it's only the absence of ifunc that'd be tested
there.
// Martin