On Thu, Oct 08, 2020 at 11:22:34AM +0000, Wilco Dijkstra wrote:
> >> I think a better way forward would be to make the builtin_clz/ctz more defined.
> >> Having undefined values is a source of unnecessary bugs given practically all
> >> modern targets return the number of bits for the zero input - it is relatively
> >> easy to ensure this on the few targets that don't.
> >
> > Well, e.g. i?86/x86_64 in most commonly used CPU flags is really undefined
> > (the register is unchanged).  And -1 is also quite commonly used value,
> > e.g. powerpc, gcn, xtensa.
> 
> So wouldn't it be easy to initialize the register before you do the bsr to get
> the same result as with BMI? I don't think an extra mov can affect performance
> in actual code (and GCC could still optimize the zero case if the input range
> doesn't include zero).
> 
> -1 is more complex, if these targets don't want to add extra instructions to fix
> it up, we could define the zero result either -1 or #bits depending on the target
> (still better than completely undefined).

Having it undefined allows optimizations, and has been that way for years.
We just should make sure that we optimize code like x ? __builtin_c[lt]z (x) : 32;
etc. properly (and I believe we do).
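
For illustration, the guarded pattern mentioned above could be written roughly
as below. This is only a sketch: the helper names clz_or_width and ctz_or_width
are made up for this example and are not part of GCC, and the width is computed
from the type rather than hard-coded as 32. Whether the branch is folded away
depends on the target and the optimization level.

```c
#include <limits.h>

/* Hypothetical helpers (names are assumptions, not GCC API).
   __builtin_clz/__builtin_ctz are undefined for a zero argument,
   so the zero case is handled explicitly before calling them. */

static inline int clz_or_width(unsigned int x)
{
    /* Return the bit width of unsigned int when x is zero. */
    return x ? __builtin_clz(x) : (int)(sizeof(unsigned int) * CHAR_BIT);
}

static inline int ctz_or_width(unsigned int x)
{
    /* Same guard for the count of trailing zeros. */
    return x ? __builtin_ctz(x) : (int)(sizeof(unsigned int) * CHAR_BIT);
}
```

On a target whose instruction already yields the bit width for a zero input,
the compiler can in principle drop the explicit test; elsewhere it remains as
a compare or conditional move.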

        Jakub
