Hi,

The attached patches implement three new (overloaded) intrinsics for ARM 
targets:
    T __builtin_arm_ldrex(T *addr)
    int __builtin_arm_strex(T val, T *addr)
    void __builtin_arm_clrex()

The idea is that (with quite a bit of hedging about backend register allocation 
and spills) these can be used to implement the higher-level atomic operations 
that already exist (and many more that don't happen to fit what x86 can do) in 
normal C and C++ code.
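
As a rough illustration of the idea above (a sketch only, not code from the
patches): an atomic "fetch max", which has no single x86 instruction, written
as an ldrex/strex retry loop. The name fetch_max_i32 and the HAVE_ARM_EXCLUSIVE
macro are made up for this example. It assumes strex returns 0 on success, as
the instruction itself does, and it adds no memory barriers. The non-ARM
fallback using the generic __atomic builtins is only there so the sketch
builds elsewhere.

```c
#include <stdint.h>

/* Compilers without __has_builtin: treat every builtin as unavailable. */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

/* HAVE_ARM_EXCLUSIVE is a hypothetical macro for this sketch only. */
#if defined(__arm__) && __has_builtin(__builtin_arm_ldrex) && \
    __has_builtin(__builtin_arm_strex)
#define HAVE_ARM_EXCLUSIVE 1
#else
#define HAVE_ARM_EXCLUSIVE 0
#endif

/* Hypothetical helper: atomically store max(*addr, val) into *addr and
   return the value that was there before. Relaxed ordering: no barriers. */
static inline int32_t fetch_max_i32(int32_t *addr, int32_t val) {
#if HAVE_ARM_EXCLUSIVE
    int32_t old;
    do {
        old = __builtin_arm_ldrex(addr);   /* exclusive load */
        if (old >= val) {
            __builtin_arm_clrex();         /* nothing to store: drop the monitor */
            break;
        }
        /* strex yields 0 on success, non-zero if exclusivity was lost */
    } while (__builtin_arm_strex(val, addr) != 0);
    return old;
#else
    /* Portable stand-in: compare-exchange loop with the same semantics. */
    int32_t old = __atomic_load_n(addr, __ATOMIC_RELAXED);
    while (old < val &&
           !__atomic_compare_exchange_n(addr, &old, val, 1,
                                        __ATOMIC_RELAXED, __ATOMIC_RELAXED)) {
        /* on failure, old has been refreshed with the current value; retry */
    }
    return old;
#endif
}
```

The usual caveat applies: whether the compiler keeps the code between the
ldrex and the strex free of spills or other memory accesses is not guaranteed,
so a loop like this is best-effort rather than architecturally bulletproof.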

There's some precedent in other compilers for instructions like these. ARM's 
own RVCT implements the bare __ldrex and __strex intrinsics on a narrower range 
of types, and from the documentation I believe these are compatible on the 
common subset.

On the low-level details, there were two choices for how to emit (e.g.) a short 
ldrex:
    call i16 @llvm.arm.ldrex.i16(i8* %addr)
    call i32 @llvm.arm.ldrex.p0i16(i16* %addr)

I have almost-complete implementations of both, and eventually decided that the 
latter was more natural: backends just aren't designed to handle intrinsics 
that actually need lowering, and the former needed hacks in multiple places to 
work around this.

The disadvantage is extra extensions and truncations occurring with every short 
exclusive operation instead of just when needed, but I've made an effort to 
fold those in a reasonably sane manner in the backend.

I've added some documentation to Clang, but there didn't seem to be any 
precedent for documenting target-specific intrinsics on the LLVM side, so I 
didn't do that.

Any comments? Can I commit?

Cheers.

Tim.

Attachment: ldrex-clang.diff

Attachment: ldrex-llvm.diff

_______________________________________________
cfe-commits mailing list
[email protected]
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
