On Fri, 23 Jan 2015, Niels Möller wrote:

[email protected] (Niels Möller) writes:

I haven't done the memory barrier thing yet; it appears to be more
complicated than I had hoped. The manual I have says that the dmb
instruction (data memory barrier) is available only with armv7 and
later, and that armv6 uses writes to CP15 registers (I haven't yet tried
to figure out what that means, or whether this method also works on later
versions).

I think I've found a simple solution. I deleted the initialized flag in
fat_init; instead, each caller reads the particular function pointer
it is interested in and checks whether it is already properly initialized.
I.e., it checks if the current value equals its static initializer, and
if so, calls fat_init.

This way, store order consistency between threads no longer matters, and
we won't need any memory barriers.
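
The scheme described above can be sketched roughly as follows. This is only
an illustration, not the actual Nettle code: apart from fat_init, every name
here (checksum, checksum_vec, checksum_unset, cpu_has_feature,
fat_init_calls) is hypothetical, and the feature test is a placeholder. The
sketch also assumes that loads and stores of a function pointer are atomic
on the target, which plain C does not strictly guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example function type; not taken from Nettle. */
typedef uint32_t checksum_func(const uint8_t *data, size_t length);

/* Portable implementation. */
static uint32_t checksum_generic(const uint8_t *data, size_t length)
{
  uint32_t sum = 0;
  for (size_t i = 0; i < length; i++)
    sum += data[i];
  return sum;
}

/* Stand-in for an optimized variant (e.g. assembly); same result here. */
static uint32_t checksum_optimized(const uint8_t *data, size_t length)
{
  return checksum_generic(data, length);
}

/* Marker used only as the static initializer; it is never called. */
static uint32_t checksum_unset(const uint8_t *data, size_t length)
{
  (void) data;
  (void) length;
  return 0;
}

/* Placeholder for real CPU feature detection (assumption). */
static int cpu_has_feature(void)
{
  return 0;
}

/* The function pointer starts out equal to its static initializer. */
static checksum_func *checksum_vec = checksum_unset;

/* Counts initializations, for illustration only. */
static int fat_init_calls = 0;

/* No "initialized" flag: running this more than once is harmless,
   since it always stores the same values. */
static void fat_init(void)
{
  fat_init_calls++;
  checksum_vec = cpu_has_feature() ? checksum_optimized : checksum_generic;
}

/* Public entry point: inspects only the one pointer it needs. */
uint32_t checksum(const uint8_t *data, size_t length)
{
  checksum_func *f = checksum_vec;
  if (f == checksum_unset)
    {
      /* Still the static initializer, so set things up and re-read. */
      fat_init();
      f = checksum_vec;
    }
  return f(data, length);
}
```

Because each caller looks only at the pointer it is about to use, a thread
that sees a stale value just runs fat_init again, and it never depends on
the order in which another thread's stores become visible.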

I'd like to merge this code on the master branch soon. It would be nice
if anyone else could give it a little testing, in particular on various
ARM devices. I've tested it on a few different x86_64 PCs and an ARMv7
pandaboard, all running GNU/Linux.

I tested it on a Raspberry Pi (ARMv6), and it seems to work pretty much as intended: I was able to do a fat build that includes the neon code, and the testsuite still runs successfully (so the detection seems to work).

I also tested building for ARMv5 using the Android NDK, and I noted that arm/v6/aes*.asm require a ".arch armv6" at the start; otherwise they fail to assemble in that configuration. (The neon sources seem to have ".fpu neon" similarly already. I'm not sure whether some of the neon sources would require an ".arch armv7-a" as well, but they did build just fine in my test, so perhaps it isn't necessary.)

To test this for yourself in case you're interested, add <ndk>/toolchains/arm-linux-androideabi-4.6/prebuilt/*x86*/bin to your path and configure with these lines:
SYSROOT=<ndk>/platforms/android-3/arch-arm/
CC="arm-linux-androideabi-gcc --sysroot=$SYSROOT" CXX="arm-linux-androideabi-g++ --sysroot=$SYSROOT" ./configure --host=arm-linux-gnueabi --enable-fat

Other than that, building with --enable-fat does seem to do the right thing, and much better than the current setup. (E.g. currently, when cross-compiling for Raspberry Pi, it fails to enable the v6 routines, since the host triplet is arm-bcm2708hardfp-linux-gnueabi even though it's an ARMv6 device. When building on such a device, config.guess gives armv6l-unknown-linux-gnueabihf instead.)


I take it you've tested building for Windows? The x86 detection should be much simpler, though, so it's only the absence of ifunc that would be tested there.

// Martin
