Re: RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature [v6]

2022-06-07 Thread Jatin Bhateja
On Tue, 7 Jun 2022 04:29:40 GMT, Xiaohong Gong wrote: >> Currently the vector load with mask when the given index happens out of the >> array boundary is implemented with pure Java scalar code to avoid the IOOBE >> (IndexOutOfBoundsException). This is necessary for architectures that do >>

Re: RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature [v5]

2022-06-07 Thread Jatin Bhateja
On Tue, 7 Jun 2022 02:22:53 GMT, Xiaohong Gong wrote: >> test/micro/org/openjdk/bench/jdk/incubator/vector/LoadMaskedIOOBEBenchmark.java >> line 97: >> >>> 95: public void byteLoadArrayMaskIOOBE() { >>> 96: for (int i = 0; i < inSize; i += bspecies.length()) { >>> 97:

Re: RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature [v5]

2022-06-06 Thread Jatin Bhateja
On Thu, 2 Jun 2022 03:27:59 GMT, Xiaohong Gong wrote: >> Currently the vector load with mask when the given index happens out of the >> array boundary is implemented with pure Java scalar code to avoid the IOOBE >> (IndexOutOfBoundsException). This is necessary for architectures that do >>

Integrated: 8284960: Integration of JEP 426: Vector API (Fourth Incubator)

2022-05-31 Thread Jatin Bhateja
On Wed, 27 Apr 2022 11:03:48 GMT, Jatin Bhateja wrote: > Hi All, > > Patch adds the planned support for new vector operations and APIs targeted > for [JEP 426: Vector API (Fourth > Incubator).](https://bugs.openjdk.java.net/browse/JDK-8280173) > > Following is the bri

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v10]

2022-05-31 Thread Jatin Bhateja
On Wed, 25 May 2022 06:29:23 GMT, Jatin Bhateja wrote: >> Hi All, >> >> Patch adds the planned support for new vector operations and APIs targeted >> for [JEP 426: Vector API (Fourth >> Incubator).](https://bugs.openjdk.java.net/browse/JDK-8280173) >>

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v8]

2022-05-26 Thread Jatin Bhateja
On Wed, 25 May 2022 06:25:53 GMT, Jatin Bhateja wrote: >> src/hotspot/cpu/x86/assembler_x86.cpp line 8173: >> >>> 8171: >>> 8172: void Assembler::vinsertf32x4(XMMRegister dst, XMMRegister nds, >>> XMMRegister src, uint8_t imm8) { >>> 8173: ass

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v9]

2022-05-25 Thread Jatin Bhateja
On Wed, 25 May 2022 05:50:23 GMT, Jatin Bhateja wrote: >> Hi All, >> >> Patch adds the planned support for new vector operations and APIs targeted >> for [JEP 426: Vector API (Fourth >> Incubator).](https://bugs.openjdk.java.net/browse/JDK-8280173) >>

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v8]

2022-05-25 Thread Jatin Bhateja
On Mon, 23 May 2022 22:17:40 GMT, Vladimir Kozlov wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> 8284960: Integrating incremental patches. > > src/hotspot/cpu/x86/assembler

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v10]

2022-05-25 Thread Jatin Bhateja
over AARCH64 and X86 targets different AVX levels. > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 20 commits: - 8284960: Post merg

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v9]

2022-05-24 Thread Jatin Bhateja
over AARCH64 and X86 targets different AVX levels. > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8284960: Review comments resolved. --

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v3]

2022-05-23 Thread Jatin Bhateja
On Thu, 12 May 2022 23:56:49 GMT, Vladimir Ivanov wrote: >> Jatin Bhateja has updated the pull request with a new target base due to a >> merge or a rebase. The pull request now contains 11 commits: >> >> - Merge branch 'master' of http://github.com/openjdk/jdk into J

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v7]

2022-05-20 Thread Jatin Bhateja
On Thu, 19 May 2022 21:19:49 GMT, Paul Sandoz wrote: >> Jatin Bhateja has updated the pull request with a new target base due to a >> merge or a rebase. The pull request now contains 16 commits: >> >> - Merge branch 'master' of http://github.com/openjdk/jdk into J

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v8]

2022-05-20 Thread Jatin Bhateja
over AARCH64 and X86 targets different AVX levels. > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8284960: Integrating incremental patches.

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v3]

2022-05-19 Thread Jatin Bhateja
On Thu, 19 May 2022 15:33:49 GMT, Jatin Bhateja wrote: >> Do you mean it's important to apply the transformation at the right node >> (pick the right node as the root) and it is hard to make a decision during >> GVN? > > Yes, that's what I meant, but wit

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v7]

2022-05-19 Thread Jatin Bhateja
over AARCH64 and X86 targets different AVX levels. > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 16 commits: - Merge branch

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v3]

2022-05-19 Thread Jatin Bhateja
On Wed, 18 May 2022 23:35:54 GMT, Vladimir Ivanov wrote: >> It was an attempt to facilitate inlining of these APIs over targets which >> do not intrinsify them. I agree it's not a generic fix since three APIs are >> piggybacking on the same entry point and without the knowledge of opcode it will

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v3]

2022-05-19 Thread Jatin Bhateja
On Wed, 18 May 2022 23:28:22 GMT, Vladimir Ivanov wrote: >> It's more of a chicken-and-egg problem here: for a masked reverse operation, >> the Reverse IR node is followed by a Blend node, so in such a case doing an >> eager Identity transform in Reverse::Identity will not work; also, deferring >> this

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v6]

2022-05-17 Thread Jatin Bhateja
over AARCH64 and X86 targets different AVX levels. > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8284960: Adding --enable-preview

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v5]

2022-05-17 Thread Jatin Bhateja
over AARCH64 and X86 targets different AVX levels. > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 13 commits: - Merge branch

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v4]

2022-05-13 Thread Jatin Bhateja
On Thu, 12 May 2022 22:48:26 GMT, Vladimir Ivanov wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> 8284960: Review comments resolution. > > src/hotspot/cpu/x86/stubGenerator_x8

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v3]

2022-05-13 Thread Jatin Bhateja
On Thu, 12 May 2022 22:40:50 GMT, Vladimir Ivanov wrote: >> Jatin Bhateja has updated the pull request with a new target base due to a >> merge or a rebase. The pull request now contains 11 commits: >> >> - Merge branch 'master' of http://github.com/openjdk/jdk into J

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v4]

2022-05-13 Thread Jatin Bhateja
over AARCH64 and X86 targets different AVX levels. > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8284960: Review comments resolution. -

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v3]

2022-05-10 Thread Jatin Bhateja
over AARCH64 and X86 targets different AVX levels. > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 11 commits: - Merge branch 'ma

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v2]

2022-05-05 Thread Jatin Bhateja
On Thu, 5 May 2022 05:47:47 GMT, Jatin Bhateja wrote: >> Hi All, >> >> Patch adds the planned support for new vector operations and APIs targeted >> for [JEP 426: Vector API (Fourth >> Incubator).](https://bugs.openjdk.java.net/browse/JDK-8280173) >>

Re: RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator) [v2]

2022-05-04 Thread Jatin Bhateja
over AARCH64 and X86 targets different AVX levels. > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 10 commits: - 8284960: C

Re: RFR: 8284050: [vectorapi] Optimize masked store for non-predicated architectures [v2]

2022-05-04 Thread Jatin Bhateja
On Thu, 5 May 2022 03:17:35 GMT, Xiaohong Gong wrote: >> src/hotspot/share/opto/vectorIntrinsics.cpp line 1363: >> >>> 1361: // Use the vector blend to implement the masked store. The >>> biased elements are the original >>> 1362: // values in the memory. >>> 1363: Node*

RFR: 8284960: Integration of JEP 426: Vector API (Fourth Incubator)

2022-04-29 Thread Jatin Bhateja
Hi All, Patch adds the planned support for new vector operations and APIs targeted for [JEP 426: Vector API (Fourth Incubator).](https://bugs.openjdk.java.net/browse/JDK-8280173) Following is a brief summary of the changes: 1) Extends the scope of the existing lanewise API for the following new

Re: RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature [v2]

2022-04-28 Thread Jatin Bhateja
On Wed, 20 Apr 2022 02:44:39 GMT, Xiaohong Gong wrote: >>> The blend should be with the intended-to-store vector, so that masked lanes >>> contain the need-to-store elements and unmasked lanes contain the loaded >>> elements, which would be stored back, which results in unchanged values. >>

Re: RFR: 8284932: [Vector API] Incorrect implementation of LSHR operator for negative byte/short elements

2022-04-26 Thread Jatin Bhateja
On Sun, 17 Apr 2022 14:35:14 GMT, Jie Fu wrote: >> According to the Vector API doc, the LSHR operator computes >> a>>>(n&(ESIZE*8-1)) The documentation is correct if viewed strictly in the context of a subword vector lane; the JVM internally promotes/sign-extends subword-type scalar variables into int
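
The promotion effect described in this message can be illustrated with plain scalar Java. This is only a minimal sketch under stated assumptions, not the Vector API implementation: the class and method names are hypothetical, the lane type is taken to be `byte` (so ESIZE*8-1 is 7), and the "within lane" variant simply zero-extends the byte before shifting.

```java
// Hypothetical illustration; names are not part of the JDK or the patch.
final class LshrSubwordSketch {
    // Scalar Java semantics: the byte is sign-extended to int before >>>,
    // so the sign bits of the promoted int are shifted into the low byte.
    static byte lshrAfterPromotion(byte a, int n) {
        return (byte) (a >>> (n & 7));
    }

    // Lane-local semantics: treat the 8 bits as an unsigned lane value,
    // so zeros are shifted in from the top of the byte.
    static byte lshrWithinLane(byte a, int n) {
        return (byte) ((a & 0xFF) >>> (n & 7));
    }

    public static void main(String[] args) {
        byte a = (byte) -8; // bit pattern 0xF8
        System.out.println(lshrAfterPromotion(a, 1)); // -4  (0xFC)
        System.out.println(lshrWithinLane(a, 1));     // 124 (0x7C)
    }
}
```

For non-negative inputs both variants agree; they differ only when the sign bit of the subword value is set.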

Re: RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature

2022-04-11 Thread Jatin Bhateja
On Thu, 31 Mar 2022 03:53:15 GMT, Xiaohong Gong wrote: >> Yeah, maybe I misunderstood what you mean. So maybe the masked store >> `(store(src, m))` could be implemented with: >> >> 1) v1 = load >> 2) v2 = blend(load, src, m) >> 3) store(v2) >> >> Let's record this in JBS and fix it with a
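
The three steps quoted above (load, blend, store) can be sketched with scalar arrays standing in for vectors. This is an illustrative emulation, not HotSpot or Vector API code: `BlendStoreSketch` and `maskedStore` are hypothetical names, and the sketch assumes the full vector width starting at `offset` lies within the array bounds.

```java
import java.util.Arrays;

// Hypothetical scalar emulation of a masked store built from load + blend + store.
final class BlendStoreSketch {
    static void maskedStore(int[] mem, int offset, int[] src, boolean[] m) {
        // 1) v1 = load: read the current memory contents at full vector width
        int[] loaded = Arrays.copyOfRange(mem, offset, offset + src.length);

        // 2) v2 = blend(load, src, m): take src where the mask is set,
        //    otherwise keep the value that was already in memory
        int[] blended = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            blended[i] = m[i] ? src[i] : loaded[i];
        }

        // 3) store(v2): write the whole vector back, unmasked
        System.arraycopy(blended, 0, mem, offset, src.length);
    }

    public static void main(String[] args) {
        int[] mem = {1, 2, 3, 4, 5, 6};
        maskedStore(mem, 1, new int[] {10, 20, 30, 40},
                    new boolean[] {true, false, true, false});
        System.out.println(Arrays.toString(mem)); // [1, 10, 3, 30, 5, 6]
    }
}
```

Lanes whose mask bit is clear are written back with their original values, so memory appears unchanged there.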

Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]

2022-04-06 Thread Jatin Bhateja
On Mon, 4 Apr 2022 07:24:12 GMT, Vamsi Parasa wrote: >> Also need a jtreg test for this. > >> Also need a jtreg test for this. > > Thanks Sandhya for the review. Made the suggested changes and added jtreg > tests as well. Hi @vamsi-parasa , thanks for addressing my comments, looks good to me

Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v9]

2022-04-06 Thread Jatin Bhateja
On Wed, 6 Apr 2022 06:02:07 GMT, Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows a 3x improvement for Integer methods and up to 25% improvement for >> Long. This
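
For readers unfamiliar with the methods being intrinsified, the following sketch shows their observable semantics. It says nothing about the x86 intrinsic itself. The helper `divideUnsignedByWidening` and the class name are hypothetical, and the helper is just one way to express unsigned 32-bit division in portable Java.

```java
// Hypothetical illustration of unsigned division semantics.
final class UnsignedDivSketch {
    // Reference computation: zero-extend both operands to long, then divide.
    static int divideUnsignedByWidening(int a, int b) {
        return (int) ((a & 0xFFFFFFFFL) / (b & 0xFFFFFFFFL));
    }

    public static void main(String[] args) {
        // -1 is 0xFFFFFFFF, i.e. 4294967295 when read as unsigned
        System.out.println(-1 / 2);                          // 0 (signed)
        System.out.println(Integer.divideUnsigned(-1, 2));   // 2147483647
        System.out.println(Integer.remainderUnsigned(-1, 2)); // 1
        System.out.println(Long.divideUnsigned(-1L, 2L) == Long.MAX_VALUE); // true
    }
}
```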

Re: RFR: 8283726: x86 intrinsics for compare method in Integer and Long

2022-03-28 Thread Jatin Bhateja
On Sun, 27 Mar 2022 06:15:34 GMT, Vamsi Parasa wrote: > Implements x86 intrinsics for compare() method in java.lang.Integer and > java.lang.Long. src/hotspot/cpu/x86/x86_64.ad line 12107: > 12105: instruct compareSignedI_rReg(rRegI dst, rRegI op1, rRegI op2, rRegI > tmp, rFlagsReg cr) >

Re: RFR: 8279508: Auto-vectorize Math.round API [v18]

2022-03-24 Thread Jatin Bhateja
On Wed, 23 Mar 2022 06:55:50 GMT, Tobias Hartmann wrote: >> Jatin Bhateja has updated the pull request with a new target base due to a >> merge or a rebase. The pull request now contains 22 commits: >> >> - 8279508: Using an explicit scratch register since rscra

Re: RFR: 8279508: Auto-vectorize Math.round API [v15]

2022-03-21 Thread Jatin Bhateja
On Tue, 22 Mar 2022 01:55:38 GMT, Quan Anh Mai wrote: >> A read from the constant table will incur at minimum an L1I access penalty to >> access the code blob, or at worst even more if the data is not present in the first-level >> cache. The change to replace vpbroadcastd with vbroadcastss was made because of >>

Re: RFR: 8279508: Auto-vectorize Math.round API [v15]

2022-03-21 Thread Jatin Bhateja
On Mon, 21 Mar 2022 17:56:22 GMT, Quan Anh Mai wrote: >> constant and register to register moves are never issued to execution ports, >> rematerializing value rather than reading from memory will give better >> performance. > > I have come across this a little bit. While `movl r, i` may not

Re: RFR: 8279508: Auto-vectorize Math.round API [v17]

2022-03-18 Thread Jatin Bhateja
On Mon, 14 Mar 2022 10:35:58 GMT, Tobias Hartmann wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> 8279508: Windows build failure fix. > > `compiler/c2/cr6340864/TestFloatVect.java`

Re: RFR: 8279508: Auto-vectorize Math.round API [v18]

2022-03-18 Thread Jatin Bhateja
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 | > 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | > 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has u

Re: RFR: 8279508: Auto-vectorize Math.round API [v15]

2022-03-14 Thread Jatin Bhateja
On Mon, 14 Mar 2022 09:29:28 GMT, Andrew Haley wrote: >> Good suggestion, but as of now we are not using vector calling conventions >> for stubs. > > I don't understand this comment. If the stub is only to be used by you, then > you can determine your own calling convention. We are passing

Re: RFR: 8279508: Auto-vectorize Math.round API [v17]

2022-03-12 Thread Jatin Bhateja
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 | > 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | > 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja h

Re: RFR: 8279508: Auto-vectorize Math.round API [v16]

2022-03-12 Thread Jatin Bhateja
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 | > 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | > 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja h

Re: RFR: 8279508: Auto-vectorize Math.round API [v15]

2022-03-12 Thread Jatin Bhateja
On Sun, 13 Mar 2022 00:06:07 GMT, Quan Anh Mai wrote: >> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4161: >> >>> 4159: movl(scratch, 1056964608); >>> 4160: movq(xtmp1, scratch); >>> 4161: vbroadcastss(xtmp1, xtmp1, vec_enc); >> >> An `evpbroadcastd` would reduce this by one

Re: RFR: 8279508: Auto-vectorize Math.round API [v15]

2022-03-12 Thread Jatin Bhateja
On Sat, 12 Mar 2022 23:20:58 GMT, Quan Anh Mai wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> 8279508: Creating separate test for round double under feature check. > > src/hotspot

Re: RFR: 8279508: Auto-vectorize Math.round API [v15]

2022-03-12 Thread Jatin Bhateja
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 | > 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | > 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja h

Re: RFR: 8279508: Auto-vectorize Math.round API [v14]

2022-03-11 Thread Jatin Bhateja
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 | > 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | > 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja h

Re: RFR: 8279508: Auto-vectorize Math.round API [v9]

2022-03-11 Thread Jatin Bhateja
On Thu, 10 Mar 2022 14:29:36 GMT, Joe Darcy wrote: >> Hi @jddarcy , >> >> The test has been modified along the same lines using generic options which >> manipulate compilation thresholds and are agnostic to target platforms. >> >> * @run main/othervm -XX:Tier3CompileThreshold=100 >>

Re: RFR: 8279508: Auto-vectorize Math.round API [v13]

2022-03-10 Thread Jatin Bhateja
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 | > 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | > 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja h

Re: RFR: 8279508: Auto-vectorize Math.round API [v12]

2022-03-09 Thread Jatin Bhateja
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 | > 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | > 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has u

Re: RFR: 8279508: Auto-vectorize Math.round API [v11]

2022-03-06 Thread Jatin Bhateja
On Sun, 6 Mar 2022 09:31:27 GMT, Andrew Haley wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> 8279508: Removing +LogCompilation flag. > > src/hotspot/cpu/x86/c2_MacroAssembler

Re: RFR: 8279508: Auto-vectorize Math.round API [v9]

2022-03-04 Thread Jatin Bhateja
On Fri, 4 Mar 2022 06:06:52 GMT, Joe Darcy wrote: >> test/jdk/java/lang/Math/RoundTests.java line 32: >> >>> 30: public static void main(String... args) { >>> 31: int failures = 0; >>> 32: for (int i = 0; i < 10; i++) { >> >> Is there an idiom to trigger the

Re: RFR: 8279508: Auto-vectorize Math.round API [v2]

2022-03-02 Thread Jatin Bhateja
On Wed, 19 Jan 2022 22:09:26 GMT, Joe Darcy wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> 8279508: Adding a test for scalar intrinsification. > > The testing for this PR doesn't lo

Re: RFR: 8279508: Auto-vectorize Math.round API [v11]

2022-03-01 Thread Jatin Bhateja
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 | > 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | > 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja h

Re: RFR: 8279508: Auto-vectorize Math.round API [v10]

2022-03-01 Thread Jatin Bhateja
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 | > 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | > 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja h

Re: RFR: 8279508: Auto-vectorize Math.round API [v9]

2022-02-25 Thread Jatin Bhateja
On Fri, 25 Feb 2022 06:22:42 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar >> IR nodes for above intrinsics. >> - Test

Re: RFR: 8279508: Auto-vectorize Math.round API [v9]

2022-02-24 Thread Jatin Bhateja
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 | > 751.83 | 2274.13 | 3.02 > FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 | > 388.52 | 1334.18 | 3.43 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja h

Re: RFR: 8279508: Auto-vectorize Math.round API [v7]

2022-02-24 Thread Jatin Bhateja
On Thu, 24 Feb 2022 00:43:27 GMT, Sandhya Viswanathan wrote: > Also curious, how does the performance look with all these changes. Updated new perf numbers. - PR: https://git.openjdk.java.net/jdk/pull/7094

Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v4]

2022-02-24 Thread Jatin Bhateja
On Thu, 24 Feb 2022 02:43:46 GMT, Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows a 3x improvement for Integer methods and up to 25% improvement for >> Long. This

Re: RFR: 8279508: Auto-vectorize Math.round API [v7]

2022-02-24 Thread Jatin Bhateja
On Thu, 24 Feb 2022 01:43:27 GMT, Sandhya Viswanathan wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> 8279508: Review comments resolved. > > src/hotspot/cpu/x86/macroAssembler

Re: RFR: 8279508: Auto-vectorize Math.round API [v8]

2022-02-24 Thread Jatin Bhateja
> Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8279508: Review comments resolved. - Changes: - all: https://git.openjdk.java.net/jdk/p

Re: RFR: 8279508: Auto-vectorize Math.round API [v7]

2022-02-23 Thread Jatin Bhateja
> Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8279508: Review comments resolved. - Changes: - all: https://git.openjdk.java.net/jdk/p

Re: RFR: 8279508: Auto-vectorize Math.round API [v6]

2022-02-23 Thread Jatin Bhateja
On Wed, 23 Feb 2022 01:31:24 GMT, Sandhya Viswanathan wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> 8279508: Fixing for windows failure. > > src/hotspot/cpu/x86/c2_MacroAssembler

Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long

2022-02-22 Thread Jatin Bhateja
On Tue, 22 Feb 2022 09:24:47 GMT, Vamsi Parasa wrote: > Optimizes the divideUnsigned() and remainderUnsigned() methods in > java.lang.Integer and java.lang.Long classes using x86 intrinsics. This > change shows a 3x improvement for Integer methods and up to 25% improvement for > Long. This

Re: RFR: 8279508: Auto-vectorize Math.round API [v6]

2022-02-17 Thread Jatin Bhateja
> Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8279508: Fixing for windows failure. - Changes: - all: https://git.openjdk.java.net/jdk/p

Re: RFR: 8279508: Auto-vectorize Math.round API [v5]

2022-02-16 Thread Jatin Bhateja
On Wed, 16 Feb 2022 12:30:27 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar >> IR nodes for above intrinsics. >> - Test

Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-16 Thread Jatin Bhateja
On Wed, 16 Feb 2022 12:26:45 GMT, Jatin Bhateja wrote: >>> > Hi, IIRC for evex encoding you can embed the RC control bit directly in >>> > the evex prefix, removing the need to rely on global MXCSR register. >>> > Thanks. >>> >>>

Re: RFR: 8279508: Auto-vectorize Math.round API [v5]

2022-02-16 Thread Jatin Bhateja
und_float | 1024.00 | 825.69 | 3592.54 | 4.35 | > 825.32 | 1836.42 | 2.23 > FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 | > 412.31 | 945.82 | 2.29 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has u

Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-16 Thread Jatin Bhateja
On Mon, 14 Feb 2022 17:14:10 GMT, Jatin Bhateja wrote: >> That pseudocode would make a very useful comment too. This whole patch is >> very thinly commented. > >> > Hi, IIRC for evex encoding you can embed the RC control bit directly in >> > the evex prefix, re

Re: RFR: 8279508: Auto-vectorize Math.round API [v4]

2022-02-16 Thread Jatin Bhateja
und_float | 1024.00 | 825.69 | 3592.54 | 4.35 | > 825.32 | 1836.42 | 2.23 > FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 | > 412.31 | 945.82 | 2.29 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja h

Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-14 Thread Jatin Bhateja
On Mon, 14 Feb 2022 09:12:54 GMT, Andrew Haley wrote: >>> What does this do? Comment, even pseudo code, would be nice. >> >> Thanks @theRealAph , I shall append the comments over the routine. >> BTW, entire rounding algorithm can also be implemented using Vector API >> which can perform

Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-13 Thread Jatin Bhateja
On Sun, 13 Feb 2022 13:08:41 GMT, Jatin Bhateja wrote: >> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4066: >> >>> 4064: } >>> 4065: >>> 4066: void >>> C2_MacroAssembler::vector_cast_double_special_cases_evex(XMMRegister dst, >>

Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-13 Thread Jatin Bhateja
On Sun, 13 Feb 2022 10:58:19 GMT, Andrew Haley wrote: >> Jatin Bhateja has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request contai

Re: RFR: 8279508: Auto-vectorize Math.round API [v3]

2022-02-12 Thread Jatin Bhateja
und_float | 1024.00 | 825.69 | 3592.54 | 4.35 | > 825.32 | 1836.42 | 2.23 > FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 | > 412.31 | 945.82 | 2.29 > > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bha

Re: RFR: 8279508: Auto-vectorize Math.round API [v2]

2022-02-12 Thread Jatin Bhateja
On Fri, 21 Jan 2022 00:49:04 GMT, Sandhya Viswanathan wrote: > The JVM currently initializes the x86 mxcsr to round to nearest even, see > below in stubGenerator_x86_64.cpp: // Round to nearest (even), 64-bit mode, > exceptions masked StubRoutines::x86::_mxcsr_std = 0x1F80; The above works
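
The `0x1F80` value quoted above can be decoded field by field. The sketch below is not JVM code; the class and method names are hypothetical. It relies on the architectural MXCSR layout, in which bits 7 to 12 hold the six exception masks and bits 13 to 14 hold the rounding-control (RC) field, with RC = 0 meaning round to nearest (even).

```java
// Hypothetical decoder for the MXCSR fields relevant to this discussion.
final class MxcsrSketch {
    static final int MXCSR_STD = 0x1F80; // value quoted from stubGenerator_x86_64.cpp

    // Bits 13-14: rounding control (0 = nearest even, 1 = down, 2 = up, 3 = toward zero)
    static int roundingControl(int mxcsr) {
        return (mxcsr >>> 13) & 0x3;
    }

    // Bits 7-12: exception mask bits (a set bit masks the corresponding exception)
    static int exceptionMasks(int mxcsr) {
        return (mxcsr >>> 7) & 0x3F;
    }

    public static void main(String[] args) {
        System.out.println(roundingControl(MXCSR_STD));                    // 0
        System.out.println(Integer.toHexString(exceptionMasks(MXCSR_STD))); // 3f
    }
}
```

An RC field of 0 together with all six mask bits set matches the comment in the quoted source: round to nearest (even), exceptions masked.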

Re: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero extended) casts

2022-02-10 Thread Jatin Bhateja
On Sat, 5 Feb 2022 15:34:08 GMT, Quan Anh Mai wrote: > Hi, > > This patch implements the unsigned upcast intrinsics in x86, which are used > in vector lane-wise reinterpreting operations. > > Thank you very much. src/hotspot/cpu/x86/x86.ad line 7288: > 7286: break; > 7287:

Re: RFR: 8279508: Auto-vectorize Math.round API [v2]

2022-01-19 Thread Jatin Bhateja
3 | 12.48279907 > > Kindly review and share your feedback. > > Best Regards, > Jatin Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision: 8279508: Adding a test for scalar intrinsification.

Re: RFR: 8279508: Auto-vectorize Math.round API

2022-01-15 Thread Jatin Bhateja
On Sun, 16 Jan 2022 02:23:15 GMT, Quan Anh Mai wrote: > Hi, did we have tests for the scalar intrinsification already? Thanks. Verification is done against scalar rounding operation.

RFR: 8279508: Auto-vectorize Math.round API

2022-01-14 Thread Jatin Bhateja
Summary of changes: - Intrinsify Math.round(float) and Math.round(double) APIs. - Extend auto-vectorizer to infer vector operations on encountering scalar IR nodes for above intrinsics. - Test creation using new IR testing framework. Following are the performance numbers of a JMH micro included

Integrated: 8273322: Enhance macro logic optimization for masked logic operations.

2022-01-06 Thread Jatin Bhateja
On Mon, 20 Dec 2021 13:33:01 GMT, Jatin Bhateja wrote: > Patch extends existing macrologic inferencing algorithm to handle masked > logic operations. > > Existing algorithm: > > 1. Identify logic cone roots. > 2. Packs parent and logic child nodes into a MacroLo

Re: RFR: 8273322: Enhance macro logic optimization for masked logic operations. [v5]

2022-01-06 Thread Jatin Bhateja
cOpts.partiallyMaskedLogicOperationsLong256 > | 1024 | 996.906 | 1013.649 | 1.016794964 > o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512 > | 256 | 2045.594 | 2048.966 | 1.001648421 > o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLog

Re: RFR: 8273322: Enhance macro logic optimization for masked logic operations. [v4]

2022-01-06 Thread Jatin Bhateja
On Thu, 6 Jan 2022 17:39:20 GMT, Sandhya Viswanathan wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> 8273322: Review comments resolution. > > test/hotspot/jtreg/compiler/vectorapi/T

Re: RFR: 8273322: Enhance macro logic optimization for masked logic operations. [v3]

2022-01-05 Thread Jatin Bhateja
On Tue, 4 Jan 2022 15:11:47 GMT, Jatin Bhateja wrote: >> Patch extends existing macrologic inferencing algorithm to handle masked >> logic operations. >> >> Existing algorithm: >> >> 1. Identify logic cone roots. >> 2. Packs parent and logic child

Re: RFR: 8273322: Enhance macro logic optimization for masked logic operations. [v4]

2022-01-05 Thread Jatin Bhateja
cOpts.partiallyMaskedLogicOperationsLong256 > | 1024 | 996.906 | 1013.649 | 1.016794964 > o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512 > | 256 | 2045.594 | 2048.966 | 1.001648421 > o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMasked

Re: RFR: 8273322: Enhance macro logic optimization for masked logic operations. [v2]

2022-01-05 Thread Jatin Bhateja
On Tue, 4 Jan 2022 02:25:36 GMT, Vladimir Kozlov wrote: > I think the whole "Bitwise operation packing optimization" code should be moved > out of `compile.cpp`. Maybe to `vectornode.cpp`, where the `MacroLogicVNode` > code is located. > Hi @vnkozlov , Yes we can also extended

Re: RFR: 8273322: Enhance macro logic optimization for masked logic operations. [v3]

2022-01-04 Thread Jatin Bhateja
cOpts.partiallyMaskedLogicOperationsLong256 > | 1024 | 996.906 | 1013.649 | 1.016794964 > o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512 > | 256 | 2045.594 | 2048.966 | 1.001648421 > o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaske

Re: RFR: 8273322: Enhance macro logic optimization for masked logic operations. [v2]

2022-01-04 Thread Jatin Bhateja
On Tue, 4 Jan 2022 02:21:35 GMT, Vladimir Kozlov wrote: >> Jatin Bhateja has updated the pull request with a new target base due to a >> merge or a rebase. The incremental webrev excludes the unrelated changes >> brought in by the merge/rebase. The pull request conta

Re: RFR: 8273322: Enhance macro logic optimization for masked logic operations. [v2]

2022-01-03 Thread Jatin Bhateja
cOpts.partiallyMaskedLogicOperationsLong256 > | 1024 | 996.906 | 1013.649 | 1.016794964 > o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512 > | 256 | 2045.594 | 2048.966 | 1.001648421 > o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOper

RFR: 8273322: Enhance macro logic optimization for masked logic operations.

2021-12-20 Thread Jatin Bhateja
Patch extends existing macrologic inferencing algorithm to handle masked logic operations. Existing algorithm: 1. Identify logic cone roots. 2. Packs parent and logic child nodes into a MacroLogic node in bottom up traversal if input constraints are met. i.e. maximum number of inputs which a
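
The packing step described above can be illustrated with a small scalar sketch. It is not the HotSpot code: it assumes a logic cone of at most three inputs folded into an 8-bit truth table, as on ternary-logic hardware (the snippet is cut off before it states the actual limit), and the names `MacroLogicSketch`, `Logic3`, `truthTable` and `eval` are hypothetical.

```java
// Illustrative sketch only; assumes three inputs and an 8-bit truth table.
final class MacroLogicSketch {
    // Hypothetical functional interface for a three-input bitwise expression.
    interface Logic3 {
        int apply(int a, int b, int c);
    }

    // Canonical input patterns: bit i of each constant is the value that
    // input takes in row i of the truth table (row index = a<<2 | b<<1 | c).
    static final int A = 0xF0;
    static final int B = 0xCC;
    static final int C = 0xAA;

    // Fold a whole expression tree into one 8-bit table by evaluating it
    // once on the canonical patterns.
    static int truthTable(Logic3 f) {
        return f.apply(A, B, C) & 0xFF;
    }

    // Evaluate the table bit by bit, which is what a single fused
    // three-input logic operation would do across all bit positions.
    static int eval(int table, int a, int b, int c) {
        int result = 0;
        for (int bit = 0; bit < 32; bit++) {
            int row = (((a >>> bit) & 1) << 2)
                    | (((b >>> bit) & 1) << 1)
                    | ((c >>> bit) & 1);
            result |= ((table >>> row) & 1) << bit;
        }
        return result;
    }

    public static void main(String[] args) {
        int table = truthTable((a, b, c) -> (a & b) | c);
        System.out.println(Integer.toHexString(table));
        System.out.println(eval(table, 0x0F0F, 0x00FF, 0x1000) == ((0x0F0F & 0x00FF) | 0x1000));
    }
}
```

Once the table is known, a parent node and its logic children can be replaced by a single node carrying the three inputs and the table, which is the idea behind folding a cone into one MacroLogic node.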

Re: RFR: 8271368: [BACKOUT] JDK-8266054 VectorAPI rotate operation optimization

2021-07-28 Thread Jatin Bhateja
On Wed, 28 Jul 2021 05:35:59 GMT, Vladimir Kozlov wrote: > Backout the following changes due to vector tests failures in tier 2 and > later: > [JDK-8266054](https://bugs.openjdk.java.net/browse/JDK-8266054) VectorAPI > rotate operation optimization > > Changes also caused copyright header

Integrated: 8266054: VectorAPI rotate operation optimization

2021-07-27 Thread Jatin Bhateja
On Tue, 27 Apr 2021 17:56:04 GMT, Jatin Bhateja wrote: > Current VectorAPI Java side implementation expresses rotateLeft and > rotateRight operation using following operations:- > > vec1 = lanewise(VectorOperators.LSHL, n) > vec2 = lanewise(VectorOperators.LSHR, n) >
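
The shift-based expression of rotate quoted above can be shown in scalar form. This is a minimal sketch on a single `int` lane, not the Vector API implementation: the snippet is truncated before the combining step, so the final OR, the `32 - n` counter-shift and the masking of the count are assumptions here, and `RotateSketch` with its method names is hypothetical.

```java
// Illustrative scalar sketch of rotate built from two shifts.
final class RotateSketch {
    // Rotate left: shift left by s, shift right (unsigned) by 32 - s, then OR.
    // The count is masked to the 32-bit lane width first (assumption).
    static int rotateLeftViaShifts(int x, int n) {
        int s = n & 31;
        return (x << s) | (x >>> ((32 - s) & 31));
    }

    // Rotate right: the mirror image of the above.
    static int rotateRightViaShifts(int x, int n) {
        int s = n & 31;
        return (x >>> s) | (x << ((32 - s) & 31));
    }

    public static void main(String[] args) {
        int x = 0x80000001;
        // Compare against the JDK's built-in scalar rotates.
        System.out.println(rotateLeftViaShifts(x, 1) == Integer.rotateLeft(x, 1));
        System.out.println(rotateRightViaShifts(x, 1) == Integer.rotateRight(x, 1));
    }
}
```

Each rotate costs three operations in this form, which is the overhead a dedicated rotate operation would avoid.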

Re: RFR: 8266054: VectorAPI rotate operation optimization [v13]

2021-07-27 Thread Jatin Bhateja
On Tue, 27 Jul 2021 00:24:52 GMT, Sandhya Viswanathan wrote: >> Jatin Bhateja has updated the pull request with a new target base due to a >> merge or a rebase. The pull request now contains 19 commits: >> >> - 8266054: Re-designing benchmark to remove noise.

Re: RFR: 8266054: VectorAPI rotate operation optimization [v10]

2021-07-27 Thread Jatin Bhateja
On Tue, 27 Jul 2021 02:52:13 GMT, Eric Liu wrote: >> @sviswa7, SLP flow will either have a constant 8-bit shift value or a >> variable shift present in vector, this also includes broadcasted non-constant >> shift value or a shift value beyond 8 bits. > > It would be better to comment here, since the

Re: RFR: 8266054: VectorAPI rotate operation optimization [v13]

2021-07-27 Thread Jatin Bhateja
On Tue, 27 Jul 2021 01:54:01 GMT, Sandhya Viswanathan wrote: >> Jatin Bhateja has updated the pull request with a new target base due to a >> merge or a rebase. The pull request now contains 19 commits: >> >> - 8266054: Re-designing benchmark to remove noise.

Re: RFR: 8266054: VectorAPI rotate operation optimization [v10]

2021-07-26 Thread Jatin Bhateja
On Mon, 26 Jul 2021 17:19:07 GMT, Sandhya Viswanathan wrote: >> And'ing with shift_mask is already done on Java API side implementation >> before making a call to intrinsic routine. > > @jatin-bhateja This question is still pending. @sviswa7, SLP flow will either have a c

Re: RFR: 8266054: VectorAPI rotate operation optimization [v10]

2021-07-26 Thread Jatin Bhateja
On Mon, 26 Jul 2021 17:19:07 GMT, Sandhya Viswanathan wrote: >> And'ing with shift_mask is already done on Java API side implementation >> before making a call to intrinsic routine. > > @jatin-bhateja This question is still pending. Other than VectorAPI, SLP also inf

Re: RFR: 8266054: VectorAPI rotate operation optimization [v12]

2021-07-18 Thread Jatin Bhateja
66.01 | > -1.33 | 21140.67 | 21970.03 | 3.92 > RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 | > -2.72 | 11204.90 | 11213.48 | 0.08 > RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77 > | 5594.33 | 5544.25 | -0.90 > Ro

Re: RFR: 8266054: VectorAPI rotate operation optimization [v11]

2021-07-18 Thread Jatin Bhateja
On Sun, 18 Jul 2021 20:28:34 GMT, Jatin Bhateja wrote: >> Current VectorAPI Java side implementation expresses rotateLeft and >> rotateRight operation using following operations:- >> >> vec1 = lanewise(VectorOperators.LSHL, n) >> vec2 = lanewise(Vecto

Re: RFR: 8266054: VectorAPI rotate operation optimization [v10]

2021-07-18 Thread Jatin Bhateja
On Fri, 16 Jul 2021 00:52:21 GMT, Sandhya Viswanathan wrote: >> Jatin Bhateja has updated the pull request with a new target base due to a >> merge or a rebase. The pull request now contains 15 commits: >> >> - 8266054: Incorporating styling changes based on rev

Re: RFR: 8266054: VectorAPI rotate operation optimization [v11]

2021-07-18 Thread Jatin Bhateja
66.01 | > -1.33 | 21140.67 | 21970.03 | 3.92 > RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 | > -2.72 | 11204.90 | 11213.48 | 0.08 > RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77 > | 5594.33 | 5544.25 | -0.90 > RotateBenchm

Re: RFR: 8266054: VectorAPI rotate operation optimization [v10]

2021-07-15 Thread Jatin Bhateja
66.01 | > -1.33 | 21140.67 | 21970.03 | 3.92 > RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 | > -2.72 | 11204.90 | 11213.48 | 0.08 > RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77 > | 5594.33 | 5544.25 | -0.90 > RotateBen

Re: RFR: 8266054: VectorAPI rotate operation optimization [v9]

2021-06-30 Thread Jatin Bhateja
66.01 | > -1.33 | 21140.67 | 21970.03 | 3.92 > RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 | > -2.72 | 11204.90 | 11213.48 | 0.08 > RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77 > | 5594.33 | 5544.25 | -0.90 > Rotate
