>
>
>> It does seem very odd that LLVM wouldn't automatically inline a function
>> consisting of a single instruction.
>

Through trial and error I've discovered that it is the lack of the "readnone"
modifier which causes LLVM not to inline the function. After looking up
that modifier I can see why that would be the case, and indeed why its
absence would penalise optimisation of the generated ARM NEON code: LLVM
must assume that any function not so marked may read memory, so its result
could differ whenever global memory state could have been changed. In
particular, that severely restricts the reordering of instructions LLVM can do.
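
To illustrate the idea in C rather than in the builtins themselves, here is a
minimal sketch. It assumes GCC or Clang, whose __attribute__((const)) is, to my
understanding, the source-level counterpart of "readnone". The names max_f32,
sum_of_max and last_seen are made up for this example and are not taken from
the ispc builtins.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a one-instruction builtin wrapper.
 * const         : result depends only on the arguments, no memory is read
 * nothrow       : never unwinds
 * always_inline : inline at every call site */
__attribute__((const, nothrow, always_inline))
static inline float max_f32(float a, float b)
{
    return a > b ? a : b;
}

/* Global state that the loop below writes on every iteration. */
float last_seen;

float sum_of_max(const float *x, size_t n, float floor)
{
    assert(x != NULL || n == 0);

    float sum = 0.0f;
    for (size_t i = 0; i < n; i++) {
        last_seen = x[i];             /* store to global memory */
        sum += max_f32(x[i], floor);  /* cannot observe that store */
    }
    return sum;
}
```

For a body this small the compiler can work out the property for itself. The
marking matters most on a bare declaration whose body the optimiser cannot see:
without it, the call would have to be treated as possibly reading last_seen and
could not be moved across the store.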

Quite a few of the ARM NEON builtins are missing "readnone". None of the
AVX builtins that I can see is missing it. I am surprised this problem
hasn't been raised before; it's very obvious from the assembler output.
 

>
> I've asked my employer for the time to send a pull request. If it's 
> granted, happy to oblige.
>

I've been allowed this time by my employer, who wishes to remain anonymous.
I'll issue a pull request next week which applies nounwind readnone 
alwaysinline to everything in the NEON builtins, using the AVX builtins as 
a guide. I should think this will improve the optimisation quality of the 
NEON output quite a bit wherever it uses the builtins.

Niall
