On Thu, 4 Sep 2025 06:26:36 GMT, Hannes Greule <hgre...@openjdk.org> wrote:

>> This patch optimizes PopCount value transforms using KnownBits information.
>> Following are the results of the micro-benchmark included with the patch
>> 
>> System: 13th Gen Intel(R) Core(TM) i3-1315U
>> 
>> Baseline:
>> Benchmark                                      Mode  Cnt       Score   Error  Units
>> PopCountValueTransform.LogicFoldingKerenLong  thrpt    2  215460.670          ops/s
>> PopCountValueTransform.LogicFoldingKerenlInt  thrpt    2  294014.826          ops/s
>> 
>> With opt:
>> Benchmark                                      Mode  Cnt       Score   Error  Units
>> PopCountValueTransform.LogicFoldingKerenLong  thrpt    2  389978.082          ops/s
>> PopCountValueTransform.LogicFoldingKerenlInt  thrpt    2  417261.583          ops/s
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> The change looks good, but I wonder:
> 
> - if it makes sense to have some kind of IR tests (i.e., it's folded away 
> when unneeded, when the input is a constant, ...)?
> - whether the explanation could be simplified: Assuming a correct 
> implementation of the KnownBits canonicalization, we can argue
>       - `_zeroes` has the bits set that are known to be always 0. So 
> `BitsPer<Type> - popCount(_zeroes)` gives you an upper limit of how many 
> bits *might* be 1. And `BitsPer<Type> - popCount(_zeroes)` is equivalent to 
> `popCount(~_zeroes)`.
>       - `_ones` has the bits set that are known to be always 1. Trivially, 
> `popCount(_ones)` is a valid lower bound.
>       - The rest repeats how `adjust_bits_from_unsigned_bounds` works, but 
> that's not specific to the popcount nodes.
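
The bound argument quoted above can be illustrated with a small, standalone Java sketch. It is not the C2 code from the patch: the class `PopCountBounds` and its methods are hypothetical names, and the known-bits masks are plain `long` values chosen for the example. It only assumes the convention described above (`zeroes` has a bit set where the value is known to be 0, `ones` where it is known to be 1).

```java
// Hypothetical illustration of the popcount range implied by known bits.
// Not the HotSpot implementation; names and masks are assumptions.
final class PopCountBounds {

    // Every bit known to be 1 contributes to the count, so this is a lower bound.
    public static int lowerBound(long ones) {
        return Long.bitCount(ones);
    }

    // Only bits not known to be 0 can contribute, so this is an upper bound.
    // Equivalent to Long.SIZE - Long.bitCount(zeroes).
    public static int upperBound(long zeroes) {
        return Long.bitCount(~zeroes);
    }

    public static void main(String[] args) {
        long zeroes = 0xFFFF_FFFF_FFFF_FF00L; // upper 56 bits known to be 0
        long ones   = 0x0FL;                  // low 4 bits known to be 1

        // Enumerate every value consistent with the masks: bits 4..7 are free.
        for (long free = 0; free < 16; free++) {
            long x = ones | (free << 4);
            int pc = Long.bitCount(x);
            if (pc < lowerBound(ones) || pc > upperBound(zeroes)) {
                throw new AssertionError("bound violated for x = " + x);
            }
        }
        System.out.println("popcount range: ["
                + lowerBound(ones) + ", " + upperBound(zeroes) + "]");
    }
}
```

With these example masks the range is [4, 8]. When `ones == ~zeroes` (a constant input) the two bounds coincide, which is the case where the node could fold to a constant.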

Hi @SirYwell, @chhagedorn, @eme64, I have addressed your comments. Let me 
know if this is good to land.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/27075#issuecomment-3334870778
