On Mon, 8 Jun 2026 08:13:18 GMT, Fei Gao <[email protected]> wrote:

>>> @fg1417 Nice progress, I had some responses and new comments above. Main 
>>> new idea: what about Vector API vectors that create these patterns, do they 
>>> also get optimized by your changes now?
>> 
>> Hi @eme64,
>> Thanks for your review!
>> I've already created the Vector API benchmark locally, but I'm currently 
>> waiting for access to testing resources. Sorry for the delay, and thanks for 
>> your patience.
>
> Hi @eme64, thanks for your patience.
> 
> I’ve pushed the Vector API microbenchmarks in 
> `test/micro/org/openjdk/bench/jdk/incubator/vector/LongVectorReduction.java` 
> that mirror the auto-vectorization patterns, along with the corresponding IR 
> test cases. The change also benefits these Vector API microbenchmarks.
> 
> On an `Arm Neoverse V2` platform, I observed the following results:
> 
> 
> Benchmark                                 (size)    Mode   Cnt   Units     uplift
> LongVectorReduction.addBig                 512     thrpt    5    ops/ms     2.97%
> LongVectorReduction.addBig                 2048    thrpt    5    ops/ms     0.37%
> LongVectorReduction.addDotProduct          512     thrpt    5    ops/ms    50.99%
> LongVectorReduction.addDotProduct          2048    thrpt    5    ops/ms    49.95%
> LongVectorReduction.addDotProductShared    512     thrpt    5    ops/ms     0.29%
> LongVectorReduction.addDotProductShared    2048    thrpt    5    ops/ms    -0.01%
> LongVectorReduction.ifElsePhiAdd           512     thrpt    5    ops/ms     8.50%
> LongVectorReduction.ifElsePhiAdd           2048    thrpt    5    ops/ms    16.04%
> LongVectorReduction.ifElsePhiSub           512     thrpt    5    ops/ms    10.55%
> LongVectorReduction.ifElsePhiSub           2048    thrpt    5    ops/ms    11.78%
> LongVectorReduction.subDotProduct          512     thrpt    5    ops/ms    50.74%
> LongVectorReduction.subDotProduct          2048    thrpt    5    ops/ms    50.49%
> 
> 
> Thanks!

@fg1417 Thanks for the updates and benchmarks! I think the code is reasonable. 
I gave the PR another scan :)
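
For readers following along, here is a minimal scalar sketch of the kind of `long` reduction shapes the benchmark names suggest. The loop bodies are assumptions inferred only from the names `addDotProduct`, `subDotProduct` and `ifElsePhiAdd`; the class name `LongReductionSketch` is hypothetical, and the real code in `LongVectorReduction.java` uses the Vector API and may look quite different.

```java
// Hypothetical illustration only: not the code from the PR.
class LongReductionSketch {

    // Assumed shape of "addDotProduct": add each product into one accumulator.
    static long addDotProduct(long[] a, long[] b) {
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += a[i] * b[i];
        }
        return sum;
    }

    // Assumed shape of "subDotProduct": subtract each product from the accumulator.
    static long subDotProduct(long[] a, long[] b) {
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum -= a[i] * b[i];
        }
        return sum;
    }

    // Assumed shape of "ifElsePhiAdd": both branches add into the same
    // accumulator, so the two updates merge at a phi after the if/else.
    static long ifElsePhiAdd(long[] a, long[] b) {
        long sum = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] > 0) {
                sum += a[i];
            } else {
                sum += b[i];
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        long[] a = {1, -2, 3, 4};
        long[] b = {5, 6, 7, 8};
        System.out.println(addDotProduct(a, b)); // 5 - 12 + 21 + 32 = 46
        System.out.println(subDotProduct(a, b)); // -46
        System.out.println(ifElsePhiAdd(a, b));  // 1 + 6 + 3 + 4 = 14
    }
}
```

The inputs in `main` are made-up values chosen so the results can be checked by hand; they are unrelated to the measured benchmark data.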

-------------

PR Comment: https://git.openjdk.org/jdk/pull/30237#issuecomment-4658912092
