Re: RFR: 8279508: Auto-vectorize Math.round API [v18]

2022-03-24 Thread Jatin Bhateja
On Wed, 23 Mar 2022 06:55:50 GMT, Tobias Hartmann  wrote:

>> Jatin Bhateja has updated the pull request with a new target base due to a 
>> merge or a rebase. The pull request now contains 22 commits:
>> 
>>  - 8279508: Using an explicit scratch register since rscratch1 is bound to 
>> r10 and its usage is transparent to compiler.
>>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>>  - 8279508: Windows build failure fix.
>>  - 8279508: Styling comments resolved.
>>  - 8279508: Creating separate test for round double under feature check.
>>  - 8279508: Reducing the invocation count and compile thresholds for 
>> RoundTests.java.
>>  - 8279508: Review comments resolution.
>>  - 8279508: Preventing domain switch-over penalty for Math.round(float) and 
>> constraining unrolling to prevent code bloating.
>>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>>  - 8279508: Removing +LogCompilation flag.
>>  - ... and 12 more: 
>> https://git.openjdk.java.net/jdk/compare/ff0b0927...c17440cf
>
> All tests passed.

Hi @TobiHartmann, thanks for confirming.
Hi @jddarcy, @theRealAph, kindly let me know if it's good to integrate this.

-

PR: https://git.openjdk.java.net/jdk/pull/7094


Re: RFR: 8279508: Auto-vectorize Math.round API [v18]

2022-03-23 Thread Tobias Hartmann
On Fri, 18 Mar 2022 20:19:08 GMT, Jatin Bhateja  wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar 
>> IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> The following are the performance numbers of a JMH microbenchmark included with the patch:
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02
>> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 22 commits:
> 
>  - 8279508: Using an explicit scratch register since rscratch1 is bound to 
> r10 and its usage is transparent to compiler.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Windows build failure fix.
>  - 8279508: Styling comments resolved.
>  - 8279508: Creating separate test for round double under feature check.
>  - 8279508: Reducing the invocation count and compile thresholds for 
> RoundTests.java.
>  - 8279508: Review comments resolution.
>  - 8279508: Preventing domain switch-over penalty for Math.round(float) and 
> constraining unrolling to prevent code bloating.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Removing +LogCompilation flag.
>  - ... and 12 more: 
> https://git.openjdk.java.net/jdk/compare/ff0b0927...c17440cf

All tests passed.

-

PR: https://git.openjdk.java.net/jdk/pull/7094


Re: RFR: 8279508: Auto-vectorize Math.round API [v18]

2022-03-22 Thread Tobias Hartmann
On Fri, 18 Mar 2022 20:19:08 GMT, Jatin Bhateja  wrote:

>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar 
>> IR nodes for above intrinsics.
>> - Test creation using new IR testing framework.
>> 
>> The following are the performance numbers of a JMH microbenchmark included with the patch:
>> 
>> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
>> 
>> 
>> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
>> -- | -- | -- | -- | -- | -- | -- | --
>> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07
>> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93
>> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02
>> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43
>> 
>> 
>> Kindly review and share your feedback.
>> 
>> Best Regards,
>> Jatin
>
> Jatin Bhateja has updated the pull request with a new target base due to a 
> merge or a rebase. The pull request now contains 22 commits:
> 
>  - 8279508: Using an explicit scratch register since rscratch1 is bound to 
> r10 and its usage is transparent to compiler.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Windows build failure fix.
>  - 8279508: Styling comments resolved.
>  - 8279508: Creating separate test for round double under feature check.
>  - 8279508: Reducing the invocation count and compile thresholds for 
> RoundTests.java.
>  - 8279508: Review comments resolution.
>  - 8279508: Preventing domain switch-over penalty for Math.round(float) and 
> constraining unrolling to prevent code bloating.
>  - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
>  - 8279508: Removing +LogCompilation flag.
>  - ... and 12 more: 
> https://git.openjdk.java.net/jdk/compare/ff0b0927...c17440cf

Sure, I'll re-run testing and report back.

-

PR: https://git.openjdk.java.net/jdk/pull/7094


Re: RFR: 8279508: Auto-vectorize Math.round API [v18]

2022-03-18 Thread Jatin Bhateja
> Summary of changes:
> - Intrinsify Math.round(float) and Math.round(double) APIs.
> - Extend auto-vectorizer to infer vector operations on encountering scalar IR 
> nodes for above intrinsics.
> - Test creation using new IR testing framework.
> 
> The following are the performance numbers of a JMH microbenchmark included with the patch:
> 
> Test System: Intel(R) Xeon(R) Platinum 8380 CPU @ 2.30GHz (Icelake Server)
> 
> 
> Benchmark | TESTSIZE | Baseline AVX3 (ops/ms) | Withopt AVX3 (ops/ms) | Gain ratio | Baseline AVX2 (ops/ms) | Withopt AVX2 (ops/ms) | Gain ratio
> -- | -- | -- | -- | -- | -- | -- | --
> FpRoundingBenchmark.test_round_double | 1024.00 | 504.15 | 2209.54 | 4.38 | 510.36 | 548.39 | 1.07
> FpRoundingBenchmark.test_round_double | 2048.00 | 293.64 | 1271.98 | 4.33 | 293.48 | 274.01 | 0.93
> FpRoundingBenchmark.test_round_float | 1024.00 | 825.99 | 4754.66 | 5.76 | 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | 388.52 | 1334.18 | 3.43
> 
> 
> Kindly review and share your feedback.
> 
> Best Regards,
> Jatin
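
For readers following the summary above, here is a minimal sketch of the kind of scalar loop the change is aimed at. The class and method names are hypothetical and are not taken from the patch or from FpRoundingBenchmark. Whether C2 actually emits vector instructions for such a loop depends on the CPU features and JVM flags in use, so treat this as an illustration of the source-level shape only.

```java
import java.util.Arrays;

// Hypothetical illustration; not code from the patch or its JMH benchmark.
class RoundLoopSketch {

    // Math.round(float) returns the closest int, with ties rounding toward
    // positive infinity; NaN maps to 0.
    static void roundAll(float[] in, int[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Math.round(in[i]);
        }
    }

    // Math.round(double) returns a long, so the result array is long[].
    static void roundAll(double[] in, long[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = Math.round(in[i]);
        }
    }

    public static void main(String[] args) {
        float[] in = {2.5f, -2.5f, Float.NaN, 1.4f};
        int[] out = new int[in.length];
        roundAll(in, out);
        System.out.println(Arrays.toString(out)); // prints [3, -2, 0, 1]
    }
}
```

The two overloads mirror the float and double cases listed separately in the benchmark table; the sketch makes no claim about how the intrinsic itself is implemented.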

Jatin Bhateja has updated the pull request with a new target base due to a 
merge or a rebase. The pull request now contains 22 commits:

 - 8279508: Using an explicit scratch register since rscratch1 is bound to r10 
and its usage is transparent to compiler.
 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
 - 8279508: Windows build failure fix.
 - 8279508: Styling comments resolved.
 - 8279508: Creating separate test for round double under feature check.
 - 8279508: Reducing the invocation count and compile thresholds for 
RoundTests.java.
 - 8279508: Review comments resolution.
 - 8279508: Preventing domain switch-over penalty for Math.round(float) and 
constraining unrolling to prevent code bloating.
 - Merge branch 'master' of http://github.com/openjdk/jdk into JDK-8279508
 - 8279508: Removing +LogCompilation flag.
 - ... and 12 more: https://git.openjdk.java.net/jdk/compare/ff0b0927...c17440cf

-

Changes: https://git.openjdk.java.net/jdk/pull/7094/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=7094&range=17
  Stats: 800 lines in 25 files changed: 707 ins; 30 del; 63 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7094.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7094/head:pull/7094

PR: https://git.openjdk.java.net/jdk/pull/7094