Hi Jakub,

>> I think a better way forward would be to make the builtin_clz/ctz more
>> defined. Having undefined values is a source of unnecessary bugs given
>> practically all modern targets return the number of bits for the zero
>> input - it is relatively easy to ensure this on the few targets that don't.
>
> Well, e.g. i?86/x86_64 in most commonly used CPU flags is really undefined
> (the register is unchanged).  And -1 is also quite commonly used value,
> e.g. powerpc, gcn, xtensa.

So wouldn't it be easy to initialize the register before you do the bsr to get
the same result as with BMI? I don't think an extra mov can affect performance
in actual code (and GCC could still optimize the zero case if the input range
doesn't include zero).
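
To make the intended semantics concrete, here is a minimal sketch at the C
level. The name clz_defined is made up for illustration and is not a GCC API;
it only shows the result I'd like the builtin itself to give for zero:

```c
#include <limits.h>

/* Hypothetical helper (not part of GCC): count leading zeros with a
   defined result for a zero input.  */
static inline int
clz_defined (unsigned int x)
{
  /* __builtin_clz (0) is undefined today, so zero is handled explicitly
     and yields the width of the type in bits.  */
  return x ? __builtin_clz (x) : (int) (sizeof (x) * CHAR_BIT);
}
```

If the compiler can prove x is nonzero, the conditional can be dropped
entirely, which is the optimization mentioned above.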

-1 is more complex. If these targets don't want to add extra instructions to fix
it up, we could define the zero result as either -1 or #bits depending on the
target (still better than completely undefined).
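
As a rough sketch of what a target-dependent but still defined zero result
could look like from the user's side: ZERO_RESULT_IS_MINUS_ONE, ctz_at_zero and
ctz_defined below are all hypothetical names invented for this example, not
existing GCC macros or functions.

```c
#include <limits.h>

/* Hypothetical per-target setting (not an existing GCC macro):
   nonzero on targets whose instruction naturally yields -1 for zero.  */
#ifndef ZERO_RESULT_IS_MINUS_ONE
#define ZERO_RESULT_IS_MINUS_ONE 0
#endif

/* The value returned for a zero input: either -1 or the number of bits.  */
static inline int
ctz_at_zero (void)
{
  return ZERO_RESULT_IS_MINUS_ONE
	 ? -1 : (int) (sizeof (unsigned int) * CHAR_BIT);
}

/* Count trailing zeros with a documented, per-target result for zero.  */
static inline int
ctz_defined (unsigned int x)
{
  return x ? __builtin_ctz (x) : ctz_at_zero ();
}
```

Code that needs to be portable would then compare against ctz_at_zero ()
rather than assuming one of the two values.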

Cheers,
Wilco
