James Greenhalgh wrote:
>
> Have you tested this in cases where an integer dup is definitely the right
> thing to do?

Yes, this still generates:

#include <arm_neon.h>
void f(unsigned a, unsigned b, uint32x4_t *c)
{
  c[0] = vdupq_n_u32(a);
  c[1] = vdupq_n_u32(b);
}

        dup     v1.4s, w0
        dup     v0.4s, w1
        str     q1, [x2]
        str     q0, [x2, 16]
        ret

The reason is that the GP to FP register move cost is typically >= 5, while
the additional cost of '?' is just 1.

> And similar cases? If these still look good, then the patch is OK - though
> I'm still very nervous about the register allocator cost model!

Well it's complex and hard to get working well... However slightly preferring
one variant works alright (unlike using '*' which results in incorrect costs).

Wilco