On Thu, Oct 08, 2020 at 11:22:34AM +0000, Wilco Dijkstra wrote:
> >> I think a better way forward would be to make the builtin_clz/ctz more
> >> defined.
> >> Having undefined values is a source of unnecessary bugs given practically
> >> all modern targets return the number of bits for the zero input - it is
> >> relatively easy to ensure this on the few targets that don't.
> >
> > Well, e.g. i?86/x86_64 in most commonly used CPU flags is really undefined
> > (the register is unchanged). And -1 is also quite commonly used value,
> > e.g. powerpc, gcn, xtensa.
>
> So wouldn't it be easy to initialize the register before you do the bsr to get
> the same result as with BMI? I don't think an extra mov can affect performance
> in actual code (and GCC could still optimize the zero case if the input range
> doesn't include zero).
>
> -1 is more complex, if these targets don't want to add extra instructions to
> fix it up, we could define the zero result either -1 or #bits depending on the
> target (still better than completely undefined).

Having it undefined allows optimizations, and has been that way for years.
We just should make sure that we optimize code like
  x ? __builtin_c[lt]z (x) : 32;
etc. properly (and I believe we do).

	Jakub
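
For illustration, the guarded idiom mentioned above could be wrapped as in the sketch below. The helper names clz32_or_32 and ctz32_or_32 are hypothetical and not part of GCC; only __builtin_clz and __builtin_ctz are real builtins. The sketch assumes a 32-bit unsigned int, which the static_assert checks at compile time.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: unsigned int is 32 bits wide, so 32 is "the number of bits". */
static_assert(sizeof(unsigned int) * CHAR_BIT == 32,
              "this sketch assumes a 32-bit unsigned int");

/* Hypothetical helper: leading-zero count with a defined result for 0.
   The builtin is only evaluated when x is nonzero, so its undefined
   zero case is never reached. */
static int clz32_or_32(unsigned int x)
{
    return x ? __builtin_clz(x) : 32;
}

/* Hypothetical helper: trailing-zero count with a defined result for 0. */
static int ctz32_or_32(unsigned int x)
{
    return x ? __builtin_ctz(x) : 32;
}
```

Callers that need a defined value for zero use the wrappers, while code that already knows the input is nonzero can call the builtins directly. Whether the conditional is folded away into a single instruction depends on the target and the flags in use.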