On Mon, 22 Dec 2025 09:40:42 GMT, Galder Zamarreño <[email protected]> wrote:

>> Yi Wu has updated the pull request with a new target base due to a merge or 
>> a rebase. The incremental webrev excludes the unrelated changes brought in 
>> by the merge/rebase. The pull request contains four additional commits since 
>> the last revision:
>> 
>>  - Replace assert with verify
>>  - Add IRNode constant and code refactor
>>  - Merge remote-tracking branch 'origin/master' into yiwu-8373344
>>  - 8373344: Add support for FP16 min/max reduction operations
>>    
>>    This patch adds mid-end support for vectorized min/max reduction
>>    operations for half floats. It also includes backend AArch64 support
>>    for these operations.
>>    Both floating point min/max reductions don’t require strict order,
>>    because they are associative.
>>    
>>    It will generate NEON fminv/fmaxv reduction instructions when
>>    max vector length is 8B or 16B. On SVE supporting machines
>>    with vector lengths > 16B, it will generate the SVE fminv/fmaxv
>>    instructions.
>>    The patch also adds support for partial min/max reductions on
>>    SVE machines using fminv/fmaxv.
>>    
>>    Ratio of throughput(ops/ms) > 1 indicates the performance with
>>    this patch is better than the mainline.
>>    
>>    Neoverse N1 (UseSVE = 0, max vector length = 16B):
>>    Benchmark         vectorDim  Mode   Cnt     8B    16B
>>    ReductionMaxFP16        256  thrpt    9   3.69   6.44
>>    ReductionMaxFP16        512  thrpt    9   3.71   7.62
>>    ReductionMaxFP16       1024  thrpt    9   4.16   8.64
>>    ReductionMaxFP16       2048  thrpt    9   4.44   9.12
>>    ReductionMinFP16        256  thrpt    9   3.69   6.43
>>    ReductionMinFP16        512  thrpt    9   3.70   7.62
>>    ReductionMinFP16       1024  thrpt    9   4.16   8.64
>>    ReductionMinFP16       2048  thrpt    9   4.44   9.10
>>    
>>    Neoverse V1 (UseSVE = 1, max vector length = 32B):
>>    Benchmark         vectorDim  Mode   Cnt     8B    16B    32B
>>    ReductionMaxFP16        256  thrpt    9   3.96   8.62   8.02
>>    ReductionMaxFP16        512  thrpt    9   3.54   9.25  11.71
>>    ReductionMaxFP16       1024  thrpt    9   3.77   8.71  14.07
>>    ReductionMaxFP16       2048  thrpt    9   3.88   8.44  14.69
>>    ReductionMinFP16        256  thrpt    9   3.96   8.61   8.03
>>    ReductionMinFP16        512  thrpt    9   3.54   9.28  11.69
>>    ReductionMinFP16       1024  thrpt    9   3.76   8.70  14.12
>>    ReductionMinFP16       2048  thrpt    9   3.87   8.45  14.70
>>    
>>    Neoverse V2 (UseSVE = 2, max vector length = 16B)...
>
> Thanks @yiwu0b11, some superficial comments

Thanks @galderz for the code review. I've updated the code and also replaced
assert with [verify](https://github.com/openjdk/jdk/pull/28095).
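
For readers following the thread, here is a rough sketch of the scalar loop shape that an FP16 max reduction has, and of why the quoted description says strict order is not required. It is not taken from the patch: the class and method names are hypothetical, and it assumes FP16 values stored as raw `short` bits and converted with `Float.float16ToFloat` / `Float.floatToFloat16` (JDK 20+) rather than the types used by the actual benchmarks.

```java
// Illustrative only: names here are hypothetical and not part of the PR.
final class Fp16MaxReductionSketch {

    // Max-reduce FP16 values held as raw short bits. The 'reverse' flag
    // walks the array backwards to show that the result does not depend
    // on the order in which elements are combined.
    static short maxReduce(short[] data, boolean reverse) {
        float acc = Float.NEGATIVE_INFINITY;
        for (int i = 0; i < data.length; i++) {
            int idx = reverse ? data.length - 1 - i : i;
            acc = Math.max(acc, Float.float16ToFloat(data[idx]));
        }
        return Float.floatToFloat16(acc);
    }

    // Small sample input; every value is exactly representable in FP16.
    static short[] sample() {
        float[] values = {1.5f, -2.0f, 8.25f, 0.0f, 3.0f};
        short[] bits = new short[values.length];
        for (int i = 0; i < values.length; i++) {
            bits[i] = Float.floatToFloat16(values[i]);
        }
        return bits;
    }
}
```

Calling `maxReduce(sample(), false)` and `maxReduce(sample(), true)` yields the same FP16 bits, which is the property that lets a vectorizer combine lanes in any order.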

-------------

PR Comment: https://git.openjdk.org/jdk/pull/28828#issuecomment-3710056269
