Hi Jakub,

>> I think a better way forward would be to make the builtin_clz/ctz more
>> defined.
>> Having undefined values is a source of unnecessary bugs given practically all
>> modern targets return the number of bits for the zero input - it is
>> relatively easy to ensure this on the few targets that don't.
>
> Well, e.g. i?86/x86_64 in most commonly used CPU flags is really undefined
> (the register is unchanged). And -1 is also quite commonly used value,
> e.g. powerpc, gcn, xtensa.

So wouldn't it be easy to initialize the destination register before the bsr
so that you get the same result as with BMI? I don't think an extra mov can
affect performance in actual code (and GCC could still optimize away the zero
handling when the input range is known not to include zero).

-1 is more complex: if these targets don't want to add extra instructions to
fix it up, we could define the zero result as either -1 or #bits depending on
the target (still better than completely undefined).

Cheers,
Wilco
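
For illustration only, here is a minimal source-level sketch of what "more
defined" could look like from the user's side, assuming the #bits convention
for a zero input. The names clz_defined and ctz_defined are hypothetical and
not part of GCC. The sketch only shows the semantics, not how a backend would
actually emit the code.

```c
#include <limits.h>

/* Hypothetical wrappers, not GCC API: give clz/ctz a defined result for
   a zero input by returning the width of the type in bits. */
static inline int clz_defined(unsigned int x)
{
    /* __builtin_clz(0) is undefined, so guard the zero case explicitly. */
    return x ? __builtin_clz(x) : (int)(sizeof(x) * CHAR_BIT);
}

static inline int ctz_defined(unsigned int x)
{
    /* __builtin_ctz(0) is undefined as well. */
    return x ? __builtin_ctz(x) : (int)(sizeof(x) * CHAR_BIT);
}
```

A target that prefers -1 for the zero input would return -1 in the guarded
branch instead. When the compiler can prove x is nonzero, the guard can be
dropped.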