>> It does seem very odd that LLVM wouldn't automatically inline a function
>> consisting of a single instruction.

Through trial and error I've discovered that it is the lack of the "readnone" attribute which causes LLVM not to inline the function. Having looked that attribute up, I can see why that would be the case: without it, LLVM must assume the function may read or write global memory, so its result could differ whenever global memory state might have changed. That severely restricts the instruction reordering LLVM can do, and so penalises optimisation of the generated ARM NEON code.

Quite a few of the ARM NEON builtins are missing "readnone". None of the AVX builtins, as far as I can see, is missing it. I am surprised this problem hasn't been raised before; it's very obvious from the assembler output.

> I've asked my employer for the time to send a pull request. If it's
> granted, happy to oblige.

My employer, who wishes to remain anonymous, has allowed me this time. I'll issue a pull request next week which applies nounwind readnone alwaysinline to everything in the NEON builtins, using the AVX builtins as a guide. I should think this will improve the optimisation quality of the NEON output quite a bit wherever it uses the builtins.

Niall
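For anyone who wants to see the effect of these attributes outside the ISPC builtins, here is a rough C analogue. It is only a sketch: the builtins themselves are written in LLVM IR, and the function names below are hypothetical, not taken from ISPC. With GCC and clang, `__attribute__((const))` is the source-level counterpart of readnone, `nothrow` of nounwind, and `always_inline` of alwaysinline.

```c
/* Hypothetical stand-in for a one-instruction builtin wrapper (not ISPC code).
 * const         : result depends only on the arguments, no memory access
 * nothrow       : never unwinds
 * always_inline : inline at every call site
 */
__attribute__((const, nothrow, always_inline))
static inline float max_float(float a, float b)
{
    return a > b ? a : b;
}

/* Hypothetical caller. Because max_float is declared const, the compiler is
 * free to treat the call as loop-invariant even though the loop stores to
 * memory through `out` on every iteration. */
static void scale_all(float *out, const float *in, int n, float a, float b)
{
    for (int i = 0; i < n; ++i)
        out[i] = in[i] * max_float(a, b);
}
```

Comparing the assembler output with and without the attributes (for example via `-O2 -S`) should show whether the compiler takes advantage of them for a given target.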
