On 06/06/16 21:44, Jeff Hain wrote:
> With multiply it's faster when not inlined, but slower when inlined.
> For some reason the score error is smaller with multiply.

The other thing to bear in mind is that performance depends on how the
method is used. For example, if a benchmark uses only positive numbers,
the negative case will be moved out of line and all we're left with is
a single test-and-branch instruction:

    tbnz x10, #63, deoptimize_label

This instruction will usually be correctly predicted, so it costs very
little, maybe nothing. So you will get different results depending on
the benchmark, and the benchmark thus needs to be representative of
real-world use. Whatever that is!

This optimization game is hard. :-)

Andrew.
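
To make the point about input sign concrete, here is a minimal sketch of two input sets one might feed to the same benchmark. Everything in it is an assumption for illustration: `signSensitive` is a hypothetical stand-in for the method under discussion, and `InputMix`, `positiveOnly`, `mixedSign` and `countNegative` are made-up names. It does no timing itself; the arrays would be handed to a proper harness such as JMH.

```java
import java.util.Random;

public class InputMix {
    // Hypothetical stand-in for the method under test: negative inputs
    // take a separate path that a positive-only run never exercises.
    public static long signSensitive(long x) {
        if (x < 0) {
            return -(x / 3);   // path reached only for negative inputs
        }
        return x / 3;
    }

    // Inputs with the sign bit cleared, so every value is >= 0.
    public static long[] positiveOnly(int n, long seed) {
        Random r = new Random(seed);
        long[] a = new long[n];
        for (int i = 0; i < n; i++) {
            a[i] = r.nextLong() >>> 1;
        }
        return a;
    }

    // Inputs drawn from the full long range, so roughly half are negative.
    public static long[] mixedSign(int n, long seed) {
        Random r = new Random(seed);
        long[] a = new long[n];
        for (int i = 0; i < n; i++) {
            a[i] = r.nextLong();
        }
        return a;
    }

    // Counts how many inputs would take the negative path.
    public static int countNegative(long[] a) {
        int count = 0;
        for (long x : a) {
            if (x < 0) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        long[] pos = positiveOnly(1_000, 42L);
        long[] mix = mixedSign(1_000, 42L);
        System.out.println("negative inputs, positive-only set: " + countNegative(pos));
        System.out.println("negative inputs, mixed set: " + countNegative(mix));
    }
}
```

Running the same benchmark body over both arrays is one way to see how much the result depends on the input mix; which mix is "representative" remains the open question.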