https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93565
--- Comment #31 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Andrew Pinski from comment #29)
> Looking back at this one, I
(In reply to Wilco from comment #8)
> > Here is a much simpler example:
> >
> > void f (int *p, int y)
> > {
> >   int a = y & 14;
> >   *p = a | p[a];
> > }
> After r14-9692-g839bc42772ba7af66af3bd16efed4a69511312ae, we now get:
> f:
> .LFB0:
>         .cfi_startproc
>         and     w2, w1, 14
>         mov     x1, x2
>         ldr     w2, [x0, x2, lsl 2]
>         orr     w1, w2, w1
>         str     w1, [x0]
>         ret
>         .cfi_endproc
>
> There is an extra move still but the duplicated and is gone. (with
> -frename-registers added, the move is gone as REE is able to remove the zero
> extend but then there is a life range conflict so can't remove the move too).

Even with the mov it is better since that can be done with zero latency in
rename in most CPUs.

> So maybe this should be closed as fixed for GCC 14 and the cost changes for
> clz reverted.

The ctz costs are correct since it is a 2-instruction sequence - it only needs
adjusting for CSSC.
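
To illustrate the 2-instruction point: without CSSC, AArch64 has no direct count-trailing-zeros instruction, so ctz is normally done as a bit reverse (RBIT) followed by CLZ. Below is a rough portable C sketch of that identity. The helper names (rbit32, clz32, ctz32_via_rbit_clz) are made up for illustration and are not taken from GCC; the loops only model what the two instructions compute, and returning 32 for a zero input is an assumption that mirrors CLZ of an all-zero register.

```c
#include <stdint.h>

/* Hypothetical helper: portable 32-bit bit reversal, modelling RBIT. */
static uint32_t rbit32(uint32_t x)
{
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);  /* swap adjacent bits */
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);  /* swap bit pairs */
    x = ((x >> 4) & 0x0f0f0f0fu) | ((x & 0x0f0f0f0fu) << 4);  /* swap nibbles */
    x = ((x >> 8) & 0x00ff00ffu) | ((x & 0x00ff00ffu) << 8);  /* swap bytes */
    return (x >> 16) | (x << 16);                             /* swap halfwords */
}

/* Hypothetical helper: count leading zeros, modelling CLZ (32 for x == 0). */
static int clz32(uint32_t x)
{
    int n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* ctz as the two-step sequence: reverse the bits, then count leading zeros. */
static int ctz32_via_rbit_clz(uint32_t x)
{
    return clz32(rbit32(x));
}
```

With CSSC there is a single CTZ instruction, which is why only that case would need a different cost.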