On Tue, 7 Jun 2022 04:29:40 GMT, Xiaohong Gong wrote:
>> Currently, a masked vector load whose index falls outside the array
>> bounds is implemented in pure Java scalar code to avoid an IOOBE
>> (IndexOutOfBoundsException). This is necessary for architectures that do
>>
On Tue, 7 Jun 2022 02:22:53 GMT, Xiaohong Gong wrote:
>> test/micro/org/openjdk/bench/jdk/incubator/vector/LoadMaskedIOOBEBenchmark.java
>> line 97:
>>
>>> 95: public void byteLoadArrayMaskIOOBE() {
>>> 96: for (int i = 0; i < inSize; i += bspecies.length()) {
>>> 97:
On Thu, 2 Jun 2022 03:27:59 GMT, Xiaohong Gong wrote:
>> Currently, a masked vector load whose index falls outside the array
>> bounds is implemented in pure Java scalar code to avoid an IOOBE
>> (IndexOutOfBoundsException). This is necessary for architectures that do
>>
On Wed, 27 Apr 2022 11:03:48 GMT, Jatin Bhateja wrote:
> Hi All,
>
> Patch adds the planned support for new vector operations and APIs targeted
> for [JEP 426: Vector API (Fourth
> Incubator).](https://bugs.openjdk.java.net/browse/JDK-8280173)
>
> Following is the bri
On Wed, 25 May 2022 06:29:23 GMT, Jatin Bhateja wrote:
>> Hi All,
>>
>> Patch adds the planned support for new vector operations and APIs targeted
>> for [JEP 426: Vector API (Fourth
>> Incubator).](https://bugs.openjdk.java.net/browse/JDK-8280173)
>>
On Wed, 25 May 2022 06:25:53 GMT, Jatin Bhateja wrote:
>> src/hotspot/cpu/x86/assembler_x86.cpp line 8173:
>>
>>> 8171:
>>> 8172: void Assembler::vinsertf32x4(XMMRegister dst, XMMRegister nds,
>>> XMMRegister src, uint8_t imm8) {
>>> 8173: ass
On Wed, 25 May 2022 05:50:23 GMT, Jatin Bhateja wrote:
>> Hi All,
>>
>> Patch adds the planned support for new vector operations and APIs targeted
>> for [JEP 426: Vector API (Fourth
>> Incubator).](https://bugs.openjdk.java.net/browse/JDK-8280173)
>>
On Mon, 23 May 2022 22:17:40 GMT, Vladimir Kozlov wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8284960: Integrating incremental patches.
>
> src/hotspot/cpu/x86/assembler
over AARCH64 and X86 targets different AVX levels.
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request with a new target base due to a
merge or a rebase. The pull request now contains 20 commits:
- 8284960: Post merg
over AARCH64 and X86 targets different AVX levels.
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8284960: Review comments resolved.
--
On Thu, 12 May 2022 23:56:49 GMT, Vladimir Ivanov wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The pull request now contains 11 commits:
>>
>> - Merge branch 'master' of http://github.com/openjdk/jdk into J
On Thu, 19 May 2022 21:19:49 GMT, Paul Sandoz wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The pull request now contains 16 commits:
>>
>> - Merge branch 'master' of http://github.com/openjdk/jdk into J
over AARCH64 and X86 targets different AVX levels.
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8284960: Integrating incremental patches.
On Thu, 19 May 2022 15:33:49 GMT, Jatin Bhateja wrote:
>> Do you mean it's important to apply the transformation at the right node
>> (pick the right node as the root) and it is hard to make a decision during
>> GVN?
>
> Yes, that's what I meant, but wit
over AARCH64 and X86 targets different AVX levels.
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request with a new target base due to a
merge or a rebase. The pull request now contains 16 commits:
- Merge branch
On Wed, 18 May 2022 23:35:54 GMT, Vladimir Ivanov wrote:
>> It was an attempt to facilitate inlining of these APIs on targets which
>> do not intrinsify them. I agree it's not a generic fix, since three APIs are
>> piggybacking on the same entry point and without the knowledge of opcode it will
On Wed, 18 May 2022 23:28:22 GMT, Vladimir Ivanov wrote:
>> It's more of a chicken-and-egg problem here: for a masked reverse operation,
>> the Reverse IR node is followed by a Blend node, so an eager Identity
>> transform in Reverse::Identity will not work; also, deferring
>> this
over AARCH64 and X86 targets different AVX levels.
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8284960: Adding --enable-preview
over AARCH64 and X86 targets different AVX levels.
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request with a new target base due to a
merge or a rebase. The pull request now contains 13 commits:
- Merge branch
On Thu, 12 May 2022 22:48:26 GMT, Vladimir Ivanov wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8284960: Review comments resolution.
>
> src/hotspot/cpu/x86/stubGenerator_x8
On Thu, 12 May 2022 22:40:50 GMT, Vladimir Ivanov wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The pull request now contains 11 commits:
>>
>> - Merge branch 'master' of http://github.com/openjdk/jdk into J
over AARCH64 and X86 targets different AVX levels.
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8284960: Review comments resolution.
-
over AARCH64 and X86 targets different AVX levels.
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request with a new target base due to a
merge or a rebase. The pull request now contains 11 commits:
- Merge branch 'ma
On Thu, 5 May 2022 05:47:47 GMT, Jatin Bhateja wrote:
>> Hi All,
>>
>> Patch adds the planned support for new vector operations and APIs targeted
>> for [JEP 426: Vector API (Fourth
>> Incubator).](https://bugs.openjdk.java.net/browse/JDK-8280173)
>>
over AARCH64 and X86 targets different AVX levels.
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request with a new target base due to a
merge or a rebase. The pull request now contains 10 commits:
- 8284960: C
On Thu, 5 May 2022 03:17:35 GMT, Xiaohong Gong wrote:
>> src/hotspot/share/opto/vectorIntrinsics.cpp line 1363:
>>
>>> 1361: // Use the vector blend to implement the masked store. The
>>> biased elements are the original
>>> 1362: // values in the memory.
>>> 1363: Node*
Hi All,
Patch adds the planned support for new vector operations and APIs targeted for
[JEP 426: Vector API (Fourth
Incubator).](https://bugs.openjdk.java.net/browse/JDK-8280173)
Following is the brief summary of changes:-
1) Extends the scope of existing lanewise API for following new
On Wed, 20 Apr 2022 02:44:39 GMT, Xiaohong Gong wrote:
>>> The blend should be with the intended-to-store vector, so that masked lanes
>>> contain the need-to-store elements and unmasked lanes contain the loaded
>>> elements, which would be stored back, which results in unchanged values.
>>
On Sun, 17 Apr 2022 14:35:14 GMT, Jie Fu wrote:
>> According to the Vector API doc, the LSHR operator computes
>> a>>>(n&(ESIZE*8-1))
The documentation is correct if viewed strictly in the context of a subword
vector lane; the JVM internally promotes/sign-extends subword-type scalar variables into int
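The difference between the lane-wise formula quoted above and ordinary scalar Java arithmetic can be seen with a single byte value. The sketch below is only illustrative: the class and method names are hypothetical, and it uses plain scalar code rather than the Vector API itself.

```java
final class SubwordShiftSketch {
    // Lane-wise reading of a >>> (n & (ESIZE*8 - 1)) for a byte lane (ESIZE = 1):
    // the shift count is masked with 7 and zeros are shifted in from bit 7
    // of the 8-bit value.
    static byte laneLshr(byte a, int n) {
        return (byte) ((a & 0xFF) >>> (n & 7));
    }

    // Scalar Java: the byte is first sign-extended to a 32-bit int, the shift
    // is applied to that int, and only then is the result narrowed again.
    static byte scalarShift(byte a, int n) {
        return (byte) (a >>> n);
    }

    public static void main(String[] args) {
        byte a = (byte) 0x80; // -128
        System.out.println(laneLshr(a, 1));    // 64
        System.out.println(scalarShift(a, 1)); // -64
    }
}
```

The two results differ because the sign-extended upper bits of the promoted int are shifted into the low byte in the scalar case.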
On Thu, 31 Mar 2022 03:53:15 GMT, Xiaohong Gong wrote:
>> Yeah, maybe I misunderstood what you mean. So maybe the masked store
>> `(store(src, m))` could be implemented with:
>>
>> 1) v1 = load
>> 2) v2 = blend(load, src, m)
>> 3) store(v2)
>>
>> Let's record this a JBS and fix it with a
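The three steps listed above (load, blend, store) can be sketched with plain arrays standing in for vectors. This is a hedged illustration only: `MaskedStoreSketch` and `maskedStore` are hypothetical names, not HotSpot or Vector API code, and the sketch assumes the whole vector-width range `[offset, offset + src.length)` is inside the array.

```java
final class MaskedStoreSketch {
    // Emulates store(src, m) at mem[offset..] using an unmasked load,
    // a blend, and an unmasked store.
    static void maskedStore(int[] mem, int offset, int[] src, boolean[] mask) {
        int lanes = src.length;

        // 1) v1 = load: read the current memory contents for all lanes.
        int[] loaded = new int[lanes];
        System.arraycopy(mem, offset, loaded, 0, lanes);

        // 2) v2 = blend(load, src, m): take src where the mask is set,
        //    otherwise keep the value that was already in memory.
        int[] blended = new int[lanes];
        for (int i = 0; i < lanes; i++) {
            blended[i] = mask[i] ? src[i] : loaded[i];
        }

        // 3) store(v2): write all lanes back; unmasked lanes are unchanged.
        System.arraycopy(blended, 0, mem, offset, lanes);
    }

    public static void main(String[] args) {
        int[] mem = {1, 2, 3, 4, 5, 6};
        maskedStore(mem, 1, new int[] {10, 20, 30, 40},
                    new boolean[] {true, false, true, false});
        System.out.println(java.util.Arrays.toString(mem)); // [1, 10, 3, 30, 5, 6]
    }
}
```

Note that, unlike a true masked store, this sequence reads and rewrites the unmasked lanes, so it is not equivalent when another thread may modify those elements concurrently.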
On Mon, 4 Apr 2022 07:24:12 GMT, Vamsi Parasa wrote:
>> Also need a jtreg test for this.
>
>> Also need a jtreg test for this.
>
> Thanks Sandhya for the review. Made the suggested changes and added jtreg
> tests as well.
Hi @vamsi-parasa, thanks for addressing my comments. Looks good to me.
On Wed, 6 Apr 2022 06:02:07 GMT, Vamsi Parasa wrote:
>> Optimizes the divideUnsigned() and remainderUnsigned() methods in
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This
>> change shows 3x improvement for Integer methods and up to 25% improvement for
>> Long. This
On Sun, 27 Mar 2022 06:15:34 GMT, Vamsi Parasa wrote:
> Implements x86 intrinsics for compare() method in java.lang.Integer and
> java.lang.Long.
src/hotspot/cpu/x86/x86_64.ad line 12107:
> 12105: instruct compareSignedI_rReg(rRegI dst, rRegI op1, rRegI op2, rRegI
> tmp, rFlagsReg cr)
>
On Wed, 23 Mar 2022 06:55:50 GMT, Tobias Hartmann wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The pull request now contains 22 commits:
>>
>> - 8279508: Using an explicit scratch register since rscra
On Tue, 22 Mar 2022 01:55:38 GMT, Quan Anh Mai wrote:
>> A read from the constant table will incur at minimum an L1I access penalty to
>> access the code blob, or worse if the data is not present in the first-level
>> cache. The change to replace vpbroadcastd with vbroadcastss was made because of
>>
On Mon, 21 Mar 2022 17:56:22 GMT, Quan Anh Mai wrote:
>> constant and register to register moves are never issued to execution ports,
>> rematerializing value rather than reading from memory will give better
>> performance.
>
> I have come across this a little bit. While `movl r, i` may not
On Mon, 14 Mar 2022 10:35:58 GMT, Tobias Hartmann wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8279508: Windows build failure fix.
>
> `compiler/c2/cr6340864/TestFloatVect.java`
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has u
On Mon, 14 Mar 2022 09:29:28 GMT, Andrew Haley wrote:
>> Good suggestion, but as of now we are not using vector calling conventions
>> for stubs.
>
> I don't understand this comment. If the stub is only to be used by you, then
> you can determine your own calling convention.
We are passing
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
On Sun, 13 Mar 2022 00:06:07 GMT, Quan Anh Mai wrote:
>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4161:
>>
>>> 4159: movl(scratch, 1056964608);
>>> 4160: movq(xtmp1, scratch);
>>> 4161: vbroadcastss(xtmp1, xtmp1, vec_enc);
>>
>> An `evpbroadcastd` would reduce this by one
On Sat, 12 Mar 2022 23:20:58 GMT, Quan Anh Mai wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8279508: Creating separate test for round double under feature check.
>
> src/hotspot
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
On Thu, 10 Mar 2022 14:29:36 GMT, Joe Darcy wrote:
>> Hi @jddarcy ,
>>
>> The test has been modified along the same lines, using generic options that
>> manipulate compilation thresholds and are agnostic to target platforms.
>>
>> * @run main/othervm -XX:Tier3CompileThreshold=100
>>
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has u
On Sun, 6 Mar 2022 09:31:27 GMT, Andrew Haley wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8279508: Removing +LogCompilation flag.
>
> src/hotspot/cpu/x86/c2_MacroAssembler
On Fri, 4 Mar 2022 06:06:52 GMT, Joe Darcy wrote:
>> test/jdk/java/lang/Math/RoundTests.java line 32:
>>
>>> 30: public static void main(String... args) {
>>> 31: int failures = 0;
>>> 32: for (int i = 0; i < 10; i++) {
>>
>> Is there an idiom to trigger the
On Wed, 19 Jan 2022 22:09:26 GMT, Joe Darcy wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8279508: Adding a test for scalar intrinsification.
>
> The testing for this PR doesn't lo
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
On Fri, 25 Feb 2022 06:22:42 GMT, Jatin Bhateja wrote:
>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar
>> IR nodes for above intrinsics.
>> - Test
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
On Thu, 24 Feb 2022 00:43:27 GMT, Sandhya Viswanathan
wrote:
> Also curious, how does the performance look with all these changes.
Updated new perf numbers.
-
PR: https://git.openjdk.java.net/jdk/pull/7094
On Thu, 24 Feb 2022 02:43:46 GMT, Vamsi Parasa wrote:
>> Optimizes the divideUnsigned() and remainderUnsigned() methods in
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This
>> change shows 3x improvement for Integer methods and up to 25% improvement for
>> Long. This
On Thu, 24 Feb 2022 01:43:27 GMT, Sandhya Viswanathan
wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8279508: Review comments resolved.
>
> src/hotspot/cpu/x86/macroAssembler
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8279508: Review comments resolved.
-
Changes:
- all: https://git.openjdk.java.net/jdk/p
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8279508: Review comments resolved.
-
Changes:
- all: https://git.openjdk.java.net/jdk/p
On Wed, 23 Feb 2022 01:31:24 GMT, Sandhya Viswanathan
wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8279508: Fixing for windows failure.
>
> src/hotspot/cpu/x86/c2_MacroAssembler
On Tue, 22 Feb 2022 09:24:47 GMT, Vamsi Parasa wrote:
> Optimizes the divideUnsigned() and remainderUnsigned() methods in
> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This
> change shows 3x improvement for Integer methods and up to 25% improvement for
> Long. This
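For readers unfamiliar with the methods being intrinsified, the following sketch shows what `Integer.divideUnsigned` and `Integer.remainderUnsigned` compute by comparing them with a widening-to-long formulation. The class and helper names are hypothetical; only the `java.lang.Integer` and `java.lang.Long` methods are real JDK API, and nothing here reflects how the x86 intrinsic itself is written.

```java
final class UnsignedDivSketch {
    // Reference formulation: treat both 32-bit operands as unsigned by
    // zero-extending them to long, then use ordinary signed long division.
    static int divideViaLong(int dividend, int divisor) {
        return (int) ((dividend & 0xFFFFFFFFL) / (divisor & 0xFFFFFFFFL));
    }

    static int remainderViaLong(int dividend, int divisor) {
        return (int) ((dividend & 0xFFFFFFFFL) % (divisor & 0xFFFFFFFFL));
    }

    public static void main(String[] args) {
        // -1 is 0xFFFFFFFF, i.e. 4294967295 when read as unsigned.
        System.out.println(Integer.divideUnsigned(-1, 2));    // 2147483647
        System.out.println(Integer.remainderUnsigned(-1, 2)); // 1
        System.out.println(divideViaLong(-1, 2));             // 2147483647
        // -1L is 18446744073709551615 when read as unsigned.
        System.out.println(Long.remainderUnsigned(-1L, 10L)); // 5
    }
}
```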
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8279508: Fixing for windows failure.
-
Changes:
- all: https://git.openjdk.java.net/jdk/p
On Wed, 16 Feb 2022 12:30:27 GMT, Jatin Bhateja wrote:
>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar
>> IR nodes for above intrinsics.
>> - Test
On Wed, 16 Feb 2022 12:26:45 GMT, Jatin Bhateja wrote:
>>> > Hi, IIRC for evex encoding you can embed the RC control bit directly in
>>> > the evex prefix, removing the need to rely on global MXCSR register.
>>> > Thanks.
>>>
>>>
und_float | 1024.00 | 825.69 | 3592.54 | 4.35 |
> 825.32 | 1836.42 | 2.23
> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 |
> 412.31 | 945.82 | 2.29
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has u
On Mon, 14 Feb 2022 17:14:10 GMT, Jatin Bhateja wrote:
>> That pseudocode would make a very useful comment too. This whole patch is
>> very thinly commented.
>
>> > Hi, IIRC for evex encoding you can embed the RC control bit directly in
>> > the evex prefix, re
und_float | 1024.00 | 825.69 | 3592.54 | 4.35 |
> 825.32 | 1836.42 | 2.23
> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 |
> 412.31 | 945.82 | 2.29
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
On Mon, 14 Feb 2022 09:12:54 GMT, Andrew Haley wrote:
>>> What does this do? Comment, even pseudo code, would be nice.
>>
>> Thanks @theRealAph , I shall append the comments over the routine.
>> BTW, entire rounding algorithm can also be implemented using Vector API
>> which can perform
On Sun, 13 Feb 2022 13:08:41 GMT, Jatin Bhateja wrote:
>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4066:
>>
>>> 4064: }
>>> 4065:
>>> 4066: void
>>> C2_MacroAssembler::vector_cast_double_special_cases_evex(XMMRegister dst,
>>
On Sun, 13 Feb 2022 10:58:19 GMT, Andrew Haley wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The incremental webrev excludes the unrelated changes
>> brought in by the merge/rebase. The pull request contai
und_float | 1024.00 | 825.69 | 3592.54 | 4.35 |
> 825.32 | 1836.42 | 2.23
> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 |
> 412.31 | 945.82 | 2.29
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bha
On Fri, 21 Jan 2022 00:49:04 GMT, Sandhya Viswanathan
wrote:
> The JVM currently initializes the x86 mxcsr to round to nearest even, see
> below in stubGenerator_x86_64.cpp:
>     // Round to nearest (even), 64-bit mode, exceptions masked
>     StubRoutines::x86::_mxcsr_std = 0x1F80;
> The above works
On Sat, 5 Feb 2022 15:34:08 GMT, Quan Anh Mai wrote:
> Hi,
>
> This patch implements the unsigned upcast intrinsics in x86, which are used
> in vector lane-wise reinterpreting operations.
>
> Thank you very much.
src/hotspot/cpu/x86/x86.ad line 7288:
> 7286: break;
> 7287:
3 | 12.48279907
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8279508: Adding a test for scalar intrinsification.
On Sun, 16 Jan 2022 02:23:15 GMT, Quan Anh Mai wrote:
> Hi, did we have tests for the scalar intrinsification already? Thanks.
Verification is done against scalar rounding operation.
Summary of changes:
- Intrinsify Math.round(float) and Math.round(double) APIs.
- Extend auto-vectorizer to infer vector operations on encountering scalar IR
nodes for above intrinsics.
- Test creation using new IR testing framework.
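As background for the intrinsification summarized above, the scalar contract that any intrinsic or vectorized version has to reproduce is the documented behaviour of `Math.round`: ties round toward positive infinity, NaN maps to zero, and out-of-range values saturate. The sketch below only exercises the public API; the class and method names are hypothetical.

```java
final class RoundContractSketch {
    // Collects a few Math.round(float) results that cover the special cases.
    static int[] floatSamples() {
        return new int[] {
            Math.round(2.5f),                    // tie rounds up to 3
            Math.round(-2.5f),                   // tie rounds toward +infinity: -2
            Math.round(Float.NaN),               // NaN maps to 0
            Math.round(Float.POSITIVE_INFINITY), // saturates to Integer.MAX_VALUE
            Math.round(Float.NEGATIVE_INFINITY)  // saturates to Integer.MIN_VALUE
        };
    }

    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(floatSamples()));
        System.out.println(Math.round(1e30)); // double overload saturates to Long.MAX_VALUE
    }
}
```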
Following are the performance numbers of a JMH micro included
On Mon, 20 Dec 2021 13:33:01 GMT, Jatin Bhateja wrote:
> Patch extends existing macrologic inferencing algorithm to handle masked
> logic operations.
>
> Existing algorithm:
>
> 1. Identify logic cone roots.
> 2. Packs parent and logic child nodes into a MacroLo
cOpts.partiallyMaskedLogicOperationsLong256
> | 1024 | 996.906 | 1013.649 | 1.016794964
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512
> | 256 | 2045.594 | 2048.966 | 1.001648421
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLog
On Thu, 6 Jan 2022 17:39:20 GMT, Sandhya Viswanathan
wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8273322: Review comments resolution.
>
> test/hotspot/jtreg/compiler/vectorapi/T
On Tue, 4 Jan 2022 15:11:47 GMT, Jatin Bhateja wrote:
>> Patch extends existing macrologic inferencing algorithm to handle masked
>> logic operations.
>>
>> Existing algorithm:
>>
>> 1. Identify logic cone roots.
>> 2. Packs parent and logic child
cOpts.partiallyMaskedLogicOperationsLong256
> | 1024 | 996.906 | 1013.649 | 1.016794964
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512
> | 256 | 2045.594 | 2048.966 | 1.001648421
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMasked
On Tue, 4 Jan 2022 02:25:36 GMT, Vladimir Kozlov wrote:
> I think whole "Bitwise operation packing optimization" code should be moved
> out from `compile.cpp`. Maybe to `vectornode.cpp`, where the `MacroLogicVNode`
> code is located.
>
Hi @vnkozlov ,
Yes we can also extended
cOpts.partiallyMaskedLogicOperationsLong256
> | 1024 | 996.906 | 1013.649 | 1.016794964
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512
> | 256 | 2045.594 | 2048.966 | 1.001648421
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaske
On Tue, 4 Jan 2022 02:21:35 GMT, Vladimir Kozlov wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The incremental webrev excludes the unrelated changes
>> brought in by the merge/rebase. The pull request conta
cOpts.partiallyMaskedLogicOperationsLong256
> | 1024 | 996.906 | 1013.649 | 1.016794964
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512
> | 256 | 2045.594 | 2048.966 | 1.001648421
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOper
Patch extends existing macrologic inferencing algorithm to handle masked logic
operations.
Existing algorithm:
1. Identify logic cone roots.
2. Packs parent and logic child nodes into a MacroLogic node in a bottom-up
traversal if the input constraints are met.
i.e. maximum number of inputs which a
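To make the packing idea concrete: a boolean function of up to three inputs can be summarized as an 8-bit truth table, which is the form the x86 `vpternlog` instruction takes as its immediate operand. The sketch below is an assumption-laden illustration, not the HotSpot implementation; `MacroLogicSketch`, `TernaryBitOp` and `truthTable` are hypothetical names, and the bit ordering follows the common convention in which the three inputs correspond to the patterns 0xF0, 0xCC and 0xAA.

```java
final class MacroLogicSketch {
    // A boolean function of three single-bit inputs.
    interface TernaryBitOp {
        boolean apply(boolean a, boolean b, boolean c);
    }

    // Evaluates the function for all 8 input combinations. Bit index
    // (a << 2) | (b << 1) | c of the result holds f(a, b, c).
    static int truthTable(TernaryBitOp f) {
        int table = 0;
        for (int idx = 0; idx < 8; idx++) {
            boolean a = ((idx >> 2) & 1) != 0;
            boolean b = ((idx >> 1) & 1) != 0;
            boolean c = (idx & 1) != 0;
            if (f.apply(a, b, c)) {
                table |= 1 << idx;
            }
        }
        return table;
    }

    public static void main(String[] args) {
        // (a & b) | c: same value as combining the input patterns directly,
        // (0xF0 & 0xCC) | 0xAA.
        System.out.printf("0x%02X%n", truthTable((a, b, c) -> (a && b) || c)); // 0xEA
        // Three-way XOR.
        System.out.printf("0x%02X%n", truthTable((a, b, c) -> a ^ b ^ c));     // 0x96
    }
}
```

With this encoding, a cone of two-input logic operations over at most three distinct inputs collapses into one table, which is why the input count acts as the packing constraint.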
On Wed, 28 Jul 2021 05:35:59 GMT, Vladimir Kozlov wrote:
> Backout the following changes due to vector tests failures in tier 2 and
> later:
> [JDK-8266054](https://bugs.openjdk.java.net/browse/JDK-8266054) VectorAPI
> rotate operation optimization
>
> Changes also caused copyright header
On Tue, 27 Apr 2021 17:56:04 GMT, Jatin Bhateja wrote:
> Current VectorAPI Java side implementation expresses rotateLeft and
> rotateRight operation using following operations:-
>
> vec1 = lanewise(VectorOperators.LSHL, n)
> vec2 = lanewise(VectorOperators.LSHR, n)
>
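The shift-based formulation quoted above is the usual way to express a rotate when no dedicated rotate operation is used. A scalar sketch follows; it assumes the two shifted values are combined with a bitwise OR (the quoted snippet is cut off before that step), and `RotateSketch` and `rotateLeftViaShifts` are hypothetical names rather than Vector API code.

```java
final class RotateSketch {
    // Rotate left built from a left shift, a logical right shift and an OR.
    // Java masks int shift counts to their low 5 bits, so (-n) behaves like
    // (32 - n) for n in 1..31 and like 0 when n is 0.
    static int rotateLeftViaShifts(int x, int n) {
        int high = x << n;    // corresponds to the LSHL step
        int low = x >>> -n;   // corresponds to the LSHR step
        return high | low;    // assumed combining step
    }

    public static void main(String[] args) {
        int x = 0x80000001;
        System.out.println(rotateLeftViaShifts(x, 1)); // 3
        System.out.println(Integer.rotateLeft(x, 1));  // 3
    }
}
```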
On Tue, 27 Jul 2021 00:24:52 GMT, Sandhya Viswanathan
wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The pull request now contains 19 commits:
>>
>> - 8266054: Re-designing benchmark to remove noise.
>>
On Tue, 27 Jul 2021 02:52:13 GMT, Eric Liu wrote:
>> @sviswa7, the SLP flow will have either a constant 8-bit shift value or a
>> variable shift present in a vector; this also includes a broadcast non-constant
>> shift value or a shift value beyond 8 bits.
>
> It would be better to comment here, since the
On Tue, 27 Jul 2021 01:54:01 GMT, Sandhya Viswanathan
wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The pull request now contains 19 commits:
>>
>> - 8266054: Re-designing benchmark to remove noise.
>>
On Mon, 26 Jul 2021 17:19:07 GMT, Sandhya Viswanathan
wrote:
>> ANDing with shift_mask is already done in the Java API implementation
>> before the call to the intrinsic routine.
>
> @jatin-bhateja This question is still pending.
@sviswa7, SLP flow will either have a c
On Mon, 26 Jul 2021 17:19:07 GMT, Sandhya Viswanathan
wrote:
>> ANDing with shift_mask is already done in the Java API implementation
>> before the call to the intrinsic routine.
>
> @jatin-bhateja This question is still pending.
Other than VectorAPI, SLP also inf
66.01 |
> -1.33 | 21140.67 | 21970.03 | 3.92
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 |
> -2.72 | 11204.90 | 11213.48 | 0.08
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77
> | 5594.33 | 5544.25 | -0.90
> Ro
On Sun, 18 Jul 2021 20:28:34 GMT, Jatin Bhateja wrote:
>> Current VectorAPI Java side implementation expresses rotateLeft and
>> rotateRight operation using following operations:-
>>
>> vec1 = lanewise(VectorOperators.LSHL, n)
>> vec2 = lanewise(Vecto
On Fri, 16 Jul 2021 00:52:21 GMT, Sandhya Viswanathan
wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The pull request now contains 15 commits:
>>
>> - 8266054: Incorporating styling changes based on rev
66.01 |
> -1.33 | 21140.67 | 21970.03 | 3.92
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 |
> -2.72 | 11204.90 | 11213.48 | 0.08
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77
> | 5594.33 | 5544.25 | -0.90
> RotateBenchm
66.01 |
> -1.33 | 21140.67 | 21970.03 | 3.92
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 |
> -2.72 | 11204.90 | 11213.48 | 0.08
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77
> | 5594.33 | 5544.25 | -0.90
> RotateBen
66.01 |
> -1.33 | 21140.67 | 21970.03 | 3.92
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 |
> -2.72 | 11204.90 | 11213.48 | 0.08
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77
> | 5594.33 | 5544.25 | -0.90
> Rotate