:
> [http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_WithOpts.txt]()
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8252847 : Review comments resolution
-
Changes:
- all: https://git.openjdk.java.net/jdk/pul
On Tue, 22 Sep 2020 16:39:15 GMT, Jatin Bhateja wrote:
> @jatin-bhateja Can you put summary of performance improvement into JBS?
Hi @vnkozlov , @neliasso
Kindly let me know your feedback. If there are no more comments, is it OK to
integrate this patch?
-
PR: ht
:
> [http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_WithOpts.txt]()
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8252847: Review comments resolution; code reorganized to cover arraycopy for
reference types.
-
:
> [http://cr.openjdk.java.net/~jbhateja/8252847/JMH_results/ArrayCopy_AVX3_Stubs_WithOpts.txt]()
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8252847 : Modifying file permission to resolve jcheck failure.
-
Changes:
-
On Tue, 22 Sep 2020 03:34:37 GMT, Vladimir Kozlov wrote:
> @jatin-bhateja Can you put summary of performance improvement into JBS?
yes, I have added the summary in JBS.
-
PR: https://git.openjdk.java.net/jdk/pull/61
On Tue, 22 Sep 2020 03:34:37 GMT, Vladimir Kozlov wrote:
> @jatin-bhateja Can you put summary of performance improvement into JBS?
Yes, I have added the summary to JBS
-
PR: https://git.openjdk.java.net/jdk/pull/61
On Tue, 8 Jun 2021 00:30:38 GMT, Scott Gibbons
wrote:
>> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration.
>> Also allows for performance improvement for non-AVX-512 enabled platforms.
>> Due to the nature of MIME-encoded inputs, modify the intrinsic signature to
On Tue, 8 Jun 2021 10:29:44 GMT, Jatin Bhateja wrote:
>> Current VectorAPI Java side implementation expresses rotateLeft and
>> rotateRight operation using following operations:-
>>
>> vec1 = lanewise(VectorOperators.LSHL, n)
>> vec2 = lanewise(Vecto
66.01 |
> -1.33 | 21140.67 | 21970.03 | 3.92
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 |
> -2.72 | 11204.90 | 11213.48 | 0.08
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77
> | 5594.33 | 5544.25 | -0.90
> Rotate
On Tue, 8 Jun 2021 10:29:44 GMT, Jatin Bhateja wrote:
>> Current VectorAPI Java side implementation expresses rotateLeft and
>> rotateRight operation using following operations:-
>>
>> vec1 = lanewise(VectorOperators.LSHL, n)
>> vec2 = lanewise(Vecto
On Tue, 8 Jun 2021 13:25:00 GMT, Scott Gibbons
wrote:
>> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 6239:
>>
>>> 6237:
>>> 6238: __ align(32);
>>> 6239: __ BIND(L_bruteForce);
>>
>> Is this alignment needed ? Given that brute force loop is already aligned.
>
> I must be
On Sat, 8 May 2021 15:40:53 GMT, Paul Sandoz wrote:
> Looks good. Someone from the HotSpot side needs to review related changes.
>
> The way I read the perf numbers is that on non-AVX512 systems the numbers are
> in the noise (no worse, no better), with significant improvement on AVX512.
Hi
On Wed, 19 May 2021 08:20:13 GMT, Jatin Bhateja wrote:
> Relevant declarations modified and tested with -Werror, no longer see
> unchecked conversion warnings.
>
> Kindly review and approve.
This pull request has now been integrated.
Changeset: 88b11423
Author:Jatin
Relevant declarations modified and tested with -Werror, no longer see unchecked
conversion warnings.
Kindly review and approve.
-
Commit messages:
- 8267357: build breaks with -Werror option on micro benchmark added for
JDK-8256973
Changes:
66.01 |
> -1.33 | 21140.67 | 21970.03 | 3.92
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 |
> -2.72 | 11204.90 | 11213.48 | 0.08
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77
> | 5594.33 | 5544.25 | -0.90
> RotateBenc
On Fri, 7 May 2021 18:31:15 GMT, Jatin Bhateja wrote:
>> Current VectorAPI Java side implementation expresses rotateLeft and
>> rotateRight operation using following operations:-
>>
>> vec1 = lanewise(VectorOperators.LSHL, n)
>> vec2 = lanewise(Vecto
2.78
> RotateBenchmark.testRotateRightL | 256 | 11 | 8183.789 | 8193.087 | 0.11
> RotateBenchmark.testRotateRightL | 64 | 21 | 4092.686 | 4193.712 | 2.47
> RotateBenchmark.testRotateRightL | 128 | 21 | 2036.854 | 2038.927 | 0.10
> RotateBenchmark.testRotateRightL | 256 | 21 | 8155.01
On Thu, 29 Apr 2021 21:13:38 GMT, Paul Sandoz wrote:
> This PR contains API and implementation changes for [JEP-414 Vector API
> (Second Incubator)](https://openjdk.java.net/jeps/414), in preparation for
> when targeted.
>
> Enhancements are made to the API for the support of operations on
On Mon, 17 May 2021 12:06:33 GMT, Jatin Bhateja wrote:
>> Current VectorAPI Java side implementation expresses rotateLeft and
>> rotateRight operation using following operations:-
>>
>> vec1 = lanewise(VectorOperators.LSHL, n)
>> vec2 = lanewise(Vecto
66.01 |
> -1.33 | 21140.67 | 21970.03 | 3.92
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 |
> -2.72 | 11204.90 | 11213.48 | 0.08
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77
> | 5594.33 | 5544.25 | -0.90
> RotateBenchm
66.01 |
> -1.33 | 21140.67 | 21970.03 | 3.92
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 |
> -2.72 | 11204.90 | 11213.48 | 0.08
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77
> | 5594.33 | 5544.25 | -0.90
> Rotate
On Mon, 24 May 2021 05:50:44 GMT, Jatin Bhateja wrote:
>> Current VectorAPI Java side implementation expresses rotateLeft and
>> rotateRight operation using following operations:-
>>
>> vec1 = lanewise(VectorOperators.LSHL, n)
>> vec2 = lanewise(Vecto
66.01 |
> -1.33 | 21140.67 | 21970.03 | 3.92
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 |
> -2.72 | 11204.90 | 11213.48 | 0.08
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77
> | 5594.33 | 5544.25 | -0.90
> Rotate
The current Vector API Java-side implementation expresses the rotateLeft and
rotateRight operations using the following operations:
vec1 = lanewise(VectorOperators.LSHL, n)
vec2 = lanewise(VectorOperators.LSHR, n)
res = lanewise(VectorOperators.OR, vec1, vec2)
This patch moves above handling
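As an illustration of the shift-and-OR idiom described above, here is a minimal scalar sketch in Java. The class and method names are hypothetical. The snippet above writes both shifts with the same count `n`; the sketch assumes the conventional identity, in which the second shift uses the complementary count, and checks itself against `Integer.rotateLeft`. It is not the code from the patch.

```java
// Scalar sketch of a rotate built from two shifts and an OR.
// RotateSketch and its methods are hypothetical names for illustration only.
final class RotateSketch {
    // Rotate left: bits shifted out on the left re-enter on the right.
    static int rotateLeft(int x, int n) {
        int s = n & 31;                       // rotation count modulo the 32-bit lane width
        return (x << s) | (x >>> (-s & 31));  // (-s & 31) is the complementary count
    }

    // Rotate right: the mirror image of rotateLeft.
    static int rotateRight(int x, int n) {
        int s = n & 31;
        return (x >>> s) | (x << (-s & 31));
    }

    public static void main(String[] args) {
        int x = 0x80000001;
        System.out.println(Integer.toHexString(rotateLeft(x, 1)));          // 3
        System.out.println(rotateLeft(x, 5) == Integer.rotateLeft(x, 5));   // true
        System.out.println(rotateRight(x, 5) == Integer.rotateRight(x, 5)); // true
    }
}
```

A single rotate instruction, where the hardware has one, replaces all three operations, which is presumably the motivation for handling the rotate as one operation.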
2.78
> RotateBenchmark.testRotateRightL | 256 | 11 | 8183.789 | 8193.087 | 0.11
> RotateBenchmark.testRotateRightL | 64 | 21 | 4092.686 | 4193.712 | 2.47
> RotateBenchmark.testRotateRightL | 128 | 21 | 2036.854 | 2038.927 | 0.10
> RotateBenchmark.testRotateRightL | 256 | 21 | 8155.01
On Tue, 27 Apr 2021 18:43:11 GMT, Paul Sandoz wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8266054: Review comments resolution.
>
> I noticed the tests are only updated for int
On Fri, 30 Apr 2021 15:44:41 GMT, Paul Sandoz wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The incremental webrev excludes the unrelated changes
>> brought in by the merge/rebase. The pull request contai
66.01 |
> -1.33 | 21140.67 | 21970.03 | 3.92
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 |
> -2.72 | 11204.90 | 11213.48 | 0.08
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77
> | 5594.33 | 5544.25 | -0.90
> RotateBen
cOpts.partiallyMaskedLogicOperationsLong256
> | 1024 | 996.906 | 1013.649 | 1.016794964
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512
> | 256 | 2045.594 | 2048.966 | 1.001648421
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOper
On Tue, 4 Jan 2022 15:11:47 GMT, Jatin Bhateja wrote:
>> Patch extends existing macrologic inferencing algorithm to handle masked
>> logic operations.
>>
>> Existing algorithm:
>>
>> 1. Identify logic cone roots.
>> 2. Packs parent and logic child
On Tue, 4 Jan 2022 02:25:36 GMT, Vladimir Kozlov wrote:
> I think the whole "Bitwise operation packing optimization" code should be moved
> out of `compile.cpp`, maybe to `vectornode.cpp`, where the `MacroLogicVNode`
> code is located.
>
Hi @vnkozlov ,
Yes we can also extended
cOpts.partiallyMaskedLogicOperationsLong256
> | 1024 | 996.906 | 1013.649 | 1.016794964
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512
> | 256 | 2045.594 | 2048.966 | 1.001648421
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMasked
Patch extends existing macrologic inferencing algorithm to handle masked logic
operations.
Existing algorithm:
1. Identify logic cone roots.
2. Pack parent and logic child nodes into a MacroLogic node in a bottom-up
traversal if the input constraints are met.
i.e. maximum number of inputs which a
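For readers unfamiliar with what a packed three-input logic node can encode, the following Java sketch shows the general truth-table technique: any bitwise function of three inputs is fully described by an 8-bit table, which is the form the AVX-512 `vpternlog` immediate takes. The class, interface, and method names are hypothetical, and this is an illustration of the encoding only, not the inferencing code from the patch.

```java
// Sketch: derive and apply an 8-bit truth table for a three-input bitwise function.
// TernaryLogicSketch, Fn3, truthTable and eval are hypothetical names.
final class TernaryLogicSketch {
    interface Fn3 { int apply(int a, int b, int c); }

    // Evaluating the function on these three constants enumerates all 8 input
    // combinations at once: bit i of the result is f(a_i, b_i, c_i), where
    // (a_i, b_i, c_i) are the three bits of the index i.
    static int truthTable(Fn3 f) {
        return f.apply(0xF0, 0xCC, 0xAA) & 0xFF;
    }

    // Apply a truth table bit by bit to three 32-bit operands.
    static int eval(int table, int a, int b, int c) {
        int r = 0;
        for (int bit = 0; bit < 32; bit++) {
            int idx = (((a >>> bit) & 1) << 2) | (((b >>> bit) & 1) << 1) | ((c >>> bit) & 1);
            r |= ((table >>> idx) & 1) << bit;
        }
        return r;
    }

    public static void main(String[] args) {
        int table = truthTable((a, b, c) -> (a & b) ^ c);
        System.out.println(Integer.toHexString(table)); // 6a
        int a = 0x12345678, b = 0x0F0F0F0F, c = 0xDEADBEEF;
        System.out.println(eval(table, a, b, c) == ((a & b) ^ c)); // true
    }
}
```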
66.01 |
> -1.33 | 21140.67 | 21970.03 | 3.92
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 |
> -2.72 | 11204.90 | 11213.48 | 0.08
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77
> | 5594.33 | 5544.25 | -0.90
> Ro
On Sun, 18 Jul 2021 20:28:34 GMT, Jatin Bhateja wrote:
>> Current VectorAPI Java side implementation expresses rotateLeft and
>> rotateRight operation using following operations:-
>>
>> vec1 = lanewise(VectorOperators.LSHL, n)
>> vec2 = lanewise(Vecto
66.01 |
> -1.33 | 21140.67 | 21970.03 | 3.92
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 11676.46 | 11358.64 |
> -2.72 | 11204.90 | 11213.48 | 0.08
> RotateBenchmark.testRotateRightS | 256.00 | 31.00 | 5728.20 | 5772.49 | 0.77
> | 5594.33 | 5544.25 | -0.90
> RotateBenchm
On Fri, 16 Jul 2021 00:52:21 GMT, Sandhya Viswanathan
wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The pull request now contains 15 commits:
>>
>> - 8266054: Incorporating styling changes based on rev
On Tue, 27 Jul 2021 02:52:13 GMT, Eric Liu wrote:
>> @sviswa7, the SLP flow will have either a constant 8-bit shift value or a
>> variable shift present in a vector; this also includes a broadcasted
>> non-constant shift value or a shift value beyond 8 bits.
>
> It would be better comment here, since the
On Tue, 27 Jul 2021 01:54:01 GMT, Sandhya Viswanathan
wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The pull request now contains 19 commits:
>>
>> - 8266054: Re-designing benchmark to remove noise.
>
On Tue, 27 Apr 2021 17:56:04 GMT, Jatin Bhateja wrote:
> Current VectorAPI Java side implementation expresses rotateLeft and
> rotateRight operation using following operations:-
>
> vec1 = lanewise(VectorOperators.LSHL, n)
> vec2 = lanewise(VectorOperators.LSHR, n)
>
On Tue, 27 Jul 2021 00:24:52 GMT, Sandhya Viswanathan
wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The pull request now contains 19 commits:
>>
>> - 8266054: Re-designing benchmark to remove noise.
>
On Wed, 28 Jul 2021 05:35:59 GMT, Vladimir Kozlov wrote:
> Backout the following changes due to vector tests failures in tier 2 and
> later:
> [JDK-8266054](https://bugs.openjdk.java.net/browse/JDK-8266054) VectorAPI
> rotate operation optimization
>
> Changes also caused copyright header
On Mon, 26 Jul 2021 17:19:07 GMT, Sandhya Viswanathan
wrote:
>> ANDing with shift_mask is already done in the Java API-side implementation
>> before making a call to the intrinsic routine.
>
> @jatin-bhateja This question is still pending.
Other than VectorAPI , SLP also inf
On Mon, 26 Jul 2021 17:19:07 GMT, Sandhya Viswanathan
wrote:
>> ANDing with shift_mask is already done in the Java API-side implementation
>> before making a call to the intrinsic routine.
>
> @jatin-bhateja This question is still pending.
@sviswa7, SLP flow will either have a c
On Tue, 4 Jan 2022 02:21:35 GMT, Vladimir Kozlov wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The incremental webrev excludes the unrelated changes
>> brought in by the merge/rebase. The pull request conta
cOpts.partiallyMaskedLogicOperationsLong256
> | 1024 | 996.906 | 1013.649 | 1.016794964
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512
> | 256 | 2045.594 | 2048.966 | 1.001648421
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaske
On Sun, 16 Jan 2022 02:23:15 GMT, Quan Anh Mai wrote:
> Hi, did we have tests for the scalar intrinsification already? Thanks.
Verification is done against scalar rounding operation.
On Sun, 13 Feb 2022 10:58:19 GMT, Andrew Haley wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The incremental webrev excludes the unrelated changes
>> brought in by the merge/rebase. The pull request contai
On Sun, 13 Feb 2022 13:08:41 GMT, Jatin Bhateja wrote:
>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4066:
>>
>>> 4064: }
>>> 4065:
>>> 4066: void
>>> C2_MacroAssembler::vector_cast_double_special_cases_evex(XMMRegister dst,
>>
On Sat, 5 Feb 2022 15:34:08 GMT, Quan Anh Mai wrote:
> Hi,
>
> This patch implements the unsigned upcast intrinsics in x86, which are used
> in vector lane-wise reinterpreting operations.
>
> Thank you very much.
src/hotspot/cpu/x86/x86.ad line 7288:
> 7286: break;
> 7287:
On Fri, 4 Mar 2022 06:06:52 GMT, Joe Darcy wrote:
>> test/jdk/java/lang/Math/RoundTests.java line 32:
>>
>>> 30: public static void main(String... args) {
>>> 31: int failures = 0;
>>> 32: for (int i = 0; i < 10; i++) {
>>
>> Is there an idiom to trigger the
On Sun, 6 Mar 2022 09:31:27 GMT, Andrew Haley wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8279508: Removing +LogCompilation flag.
>
> src/hotspot/cpu/x86/c2_MacroAssembler
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
On Thu, 10 Mar 2022 14:29:36 GMT, Joe Darcy wrote:
>> Hi @jddarcy ,
>>
>> The test has been modified along the same lines, using generic options which
>> manipulate compilation thresholds and are agnostic to target platforms.
>>
>> * @run main/othervm -XX:Tier3CompileThreshold=100
>>
On Mon, 14 Mar 2022 09:29:28 GMT, Andrew Haley wrote:
>> Good suggestion, but as of now we are not using vector calling conventions
>> for stubs.
>
> I don't understand this comment. If the stub is only to be used by you, then
> you can determine your own calling convention.
We are passing
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has u
On Mon, 14 Mar 2022 10:35:58 GMT, Tobias Hartmann wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8279508: Windows build failure fix.
>
> `compiler/c2/cr6340864/TestFloatVect.java`
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has u
On Sat, 12 Mar 2022 23:20:58 GMT, Quan Anh Mai wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8279508: Creating separate test for round double under feature check.
>
> src/hotspot
On Sun, 13 Mar 2022 00:06:07 GMT, Quan Anh Mai wrote:
>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4161:
>>
>>> 4159: movl(scratch, 1056964608);
>>> 4160: movq(xtmp1, scratch);
>>> 4161: vbroadcastss(xtmp1, xtmp1, vec_enc);
>>
>> An `evpbroadcastd` would reduce this by one
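A note on the immediate in the quoted assembly: 1056964608 is 0x3F000000, the IEEE 754 single-precision bit pattern of 0.5f. The Java sketch below only verifies that decoding; the class and method names are hypothetical, and reading the constant as the "add one half" operand of a rounding sequence is an inference from context, not something the snippet states.

```java
// Sketch: decode the integer immediate from the quoted code as a float bit pattern.
// HalfConstantSketch and decode are hypothetical names.
final class HalfConstantSketch {
    static float decode(int bits) {
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        int bits = 1056964608;
        System.out.println(Integer.toHexString(bits)); // 3f000000
        System.out.println(decode(bits));              // 0.5
    }
}
```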
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
und_float | 1024.00 | 825.69 | 3592.54 | 4.35 |
> 825.32 | 1836.42 | 2.23
> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 |
> 412.31 | 945.82 | 2.29
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
On Fri, 21 Jan 2022 00:49:04 GMT, Sandhya Viswanathan
wrote:
> The JVM currently initializes the x86 mxcsr to round to nearest even, see
> below in stubGenerator_x86_64.cpp: // Round to nearest (even), 64-bit mode,
> exceptions masked StubRoutines::x86::_mxcsr_std = 0x1F80; The above works
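To make the quoted constant concrete, the Java sketch below decodes 0x1F80 using the MXCSR bit layout from the Intel architecture manual: bits 7 to 12 are the six exception mask bits and bits 13 to 14 are the rounding control field, where 0 means round to nearest (even). The class and method names are hypothetical; this is a decoding aid, not JVM code.

```java
// Sketch: decode the rounding control and exception mask fields of an MXCSR value.
// MxcsrSketch and its methods are hypothetical names.
final class MxcsrSketch {
    // Rounding control, bits 13..14: 0 = nearest (even), 1 = down, 2 = up, 3 = toward zero.
    static int roundingControl(int mxcsr) {
        return (mxcsr >>> 13) & 0x3;
    }

    // Exception masks, bits 7..12: all six set means every SIMD FP exception is masked.
    static boolean allExceptionsMasked(int mxcsr) {
        return ((mxcsr >>> 7) & 0x3F) == 0x3F;
    }

    public static void main(String[] args) {
        int mxcsrStd = 0x1F80; // value quoted in the message above
        System.out.println(roundingControl(mxcsrStd));     // 0
        System.out.println(allExceptionsMasked(mxcsrStd)); // true
    }
}
```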
und_float | 1024.00 | 825.69 | 3592.54 | 4.35 |
> 825.32 | 1836.42 | 2.23
> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 |
> 412.31 | 945.82 | 2.29
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bha
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8279508: Fixing for windows failure.
-
Changes:
- all: https://git.openjdk.java.net/jdk/p
On Thu, 24 Feb 2022 00:43:27 GMT, Sandhya Viswanathan
wrote:
> Also curious, how does the performance look with all these changes.
Updated new perf numbers.
-
PR: https://git.openjdk.java.net/jdk/pull/7094
On Thu, 24 Feb 2022 02:43:46 GMT, Vamsi Parasa wrote:
>> Optimizes the divideUnsigned() and remainderUnsigned() methods in
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This
>> change shows 3x improvement for Integer methods and up to 25% improvement for
>> Long. This
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8279508: Review comments resolved.
-
Changes:
- all: https://git.openjdk.java.net/jdk/p
On Thu, 24 Feb 2022 01:43:27 GMT, Sandhya Viswanathan
wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8279508: Review comments resolved.
>
> src/hotspot/cpu/x86/macroAssembler
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
On Mon, 14 Feb 2022 17:14:10 GMT, Jatin Bhateja wrote:
>> That pseudocode would make a very useful comment too. This whole patch is
>> very thinly commented.
>
>> > Hi, IIRC for evex encoding you can embed the RC control bit directly in
>> > the evex prefix, re
On Wed, 16 Feb 2022 12:26:45 GMT, Jatin Bhateja wrote:
>>> > Hi, IIRC for evex encoding you can embed the RC control bit directly in
>>> > the evex prefix, removing the need to rely on global MXCSR register.
>>> > Thanks.
>>>
>>>
und_float | 1024.00 | 825.69 | 3592.54 | 4.35 |
> 825.32 | 1836.42 | 2.23
> FpRoundingBenchmark.test_round_float | 2048.00 | 388.55 | 1895.77 | 4.88 |
> 412.31 | 945.82 | 2.29
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has u
On Wed, 16 Feb 2022 12:30:27 GMT, Jatin Bhateja wrote:
>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar
>> IR nodes for above intrinsics.
>> - Test
On Fri, 25 Feb 2022 06:22:42 GMT, Jatin Bhateja wrote:
>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar
>> IR nodes for above intrinsics.
>> - Test
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8279508: Review comments resolved.
-
Changes:
- all: https://git.openjdk.java.net/jdk/p
On Wed, 23 Feb 2022 01:31:24 GMT, Sandhya Viswanathan
wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8279508: Fixing for windows failure.
>
> src/hotspot/cpu/x86/c2_MacroAssembler
On Tue, 22 Feb 2022 09:24:47 GMT, Vamsi Parasa wrote:
> Optimizes the divideUnsigned() and remainderUnsigned() methods in
> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This
> change shows 3x improvement for Integer methods and up to 25% improvement for
> Long. This
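For reference, the semantics of the library methods named above can be sketched in plain Java. `Integer.divideUnsigned`, `Integer.remainderUnsigned`, and `Long.divideUnsigned` are the real JDK methods; the reference helpers and the class name are hypothetical and simply widen to `long` to emulate unsigned 32-bit arithmetic. This shows what the methods compute, not how the intrinsic is implemented.

```java
// Sketch: unsigned division semantics, checked against a widening reference.
// UnsignedDivSketch and the *Ref helpers are hypothetical names.
final class UnsignedDivSketch {
    // Treat both ints as unsigned 32-bit values by widening to long.
    static int divideUnsignedRef(int a, int b) {
        return (int) ((a & 0xFFFFFFFFL) / (b & 0xFFFFFFFFL));
    }

    static int remainderUnsignedRef(int a, int b) {
        return (int) ((a & 0xFFFFFFFFL) % (b & 0xFFFFFFFFL));
    }

    public static void main(String[] args) {
        // -1 is 0xFFFFFFFF, i.e. 4294967295 when read as unsigned.
        System.out.println(Integer.divideUnsigned(-1, 2));     // 2147483647
        System.out.println(Integer.remainderUnsigned(-1, 2));  // 1
        System.out.println(Integer.divideUnsigned(-1, 2) == divideUnsignedRef(-1, 2)); // true
        System.out.println(Long.divideUnsigned(-1L, 2L) == Long.MAX_VALUE);            // true
    }
}
```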
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
und_float | 1024.00 | 825.99 | 4754.66 | 5.76 |
> 751.83 | 2274.13 | 3.02
> FpRoundingBenchmark.test_round_float | 2048.00 | 412.22 | 2490.09 | 6.04 |
> 388.52 | 1334.18 | 3.43
>
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja h
On Wed, 19 Jan 2022 22:09:26 GMT, Joe Darcy wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8279508: Adding a test for scalar intrinsification.
>
> The testing for this PR doesn't lo
On Mon, 14 Feb 2022 09:12:54 GMT, Andrew Haley wrote:
>>> What does this do? Comment, even pseudo code, would be nice.
>>
>> Thanks @theRealAph , I shall append the comments over the routine.
>> BTW, entire rounding algorithm can also be implemented using Vector API
>> which can perform
On Mon, 21 Mar 2022 17:56:22 GMT, Quan Anh Mai wrote:
>> constant and register to register moves are never issued to execution ports,
>> rematerializing value rather than reading from memory will give better
>> performance.
>
> I have come across this a little bit. While `movl r, i` may not
On Wed, 23 Mar 2022 06:55:50 GMT, Tobias Hartmann wrote:
>> Jatin Bhateja has updated the pull request with a new target base due to a
>> merge or a rebase. The pull request now contains 22 commits:
>>
>> - 8279508: Using an explicit scratch register since rscra
On Tue, 22 Mar 2022 01:55:38 GMT, Quan Anh Mai wrote:
>> A read from constant table will incur minimum of L1I access penalty to
>> access code blob or at worst even more if data is not present in first level
>> cache. The change was made to replace vpbroadcastd with vbroadcastss because of
>>
On Sun, 27 Mar 2022 06:15:34 GMT, Vamsi Parasa wrote:
> Implements x86 intrinsics for compare() method in java.lang.Integer and
> java.lang.Long.
src/hotspot/cpu/x86/x86_64.ad line 12107:
> 12105: instruct compareSignedI_rReg(rRegI dst, rRegI op1, rRegI op2, rRegI
> tmp, rFlagsReg cr)
>
Summary of changes:
- Intrinsify Math.round(float) and Math.round(double) APIs.
- Extend auto-vectorizer to infer vector operations on encountering scalar IR
nodes for above intrinsics.
- Test creation using new IR testing framework.
Following are the performance numbers of a JMH micro included
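For orientation, a loop of the following shape is the kind of scalar code that calls `Math.round(float)` on each array element. The class and method names are hypothetical, and whether a given JVM actually vectorizes this loop depends on the hardware and flags; the sketch only pins down the `Math.round` semantics that any intrinsic must preserve (ties round toward positive infinity, NaN rounds to 0).

```java
// Sketch: a simple loop over Math.round(float) plus a few semantic checks.
// RoundSketch and roundAll are hypothetical names.
final class RoundSketch {
    static int[] roundAll(float[] in) {
        int[] out = new int[in.length];
        for (int i = 0; i < in.length; i++) {
            out[i] = Math.round(in[i]); // candidate for intrinsification
        }
        return out;
    }

    public static void main(String[] args) {
        int[] r = roundAll(new float[] {2.5f, -2.5f, Float.NaN, 1.4f});
        System.out.println(java.util.Arrays.toString(r)); // [3, -2, 0, 1]
    }
}
```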
On Thu, 6 Jan 2022 17:39:20 GMT, Sandhya Viswanathan
wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8273322: Review comments resolution.
>
> test/hotspot/jtreg/compiler/vectorapi/T
cOpts.partiallyMaskedLogicOperationsLong256
> | 1024 | 996.906 | 1013.649 | 1.016794964
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLogicOperationsLong512
> | 256 | 2045.594 | 2048.966 | 1.001648421
> o.o.b.jdk.incubator.vector.MaskedLogicOpts.partiallyMaskedLog
On Mon, 20 Dec 2021 13:33:01 GMT, Jatin Bhateja wrote:
> Patch extends existing macrologic inferencing algorithm to handle masked
> logic operations.
>
> Existing algorithm:
>
> 1. Identify logic cone roots.
> 2. Packs parent and logic child nodes into a MacroLo
3 | 12.48279907
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8279508: Adding a test for scalar intrinsification.
On Thu, 31 Mar 2022 03:53:15 GMT, Xiaohong Gong wrote:
>> Yeah, maybe I misunderstood what you mean. So maybe the masked store
>> `(store(src, m))` could be implemented with:
>>
>> 1) v1 = load
>> 2) v2 = blend(load, src, m)
>> 3) store(v2)
>>
>> Let's record this in JBS and fix it with a
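The three quoted steps can be sketched on plain arrays as follows. The class and method names are hypothetical, and arrays stand in for vector registers. One caveat worth stating as an assumption of this emulation: unlike a true masked store, it reads and rewrites every lane, so the whole destination range must be valid memory and unselected lanes are briefly rewritten with their old values.

```java
import java.util.Arrays;

// Sketch: emulate store(src, m) as load, blend, then an unmasked store.
// MaskedStoreSketch and maskedStore are hypothetical names.
final class MaskedStoreSketch {
    static void maskedStore(int[] dst, int offset, int[] src, boolean[] m) {
        // 1) v1 = load: read the current memory contents for every lane.
        int[] v1 = Arrays.copyOfRange(dst, offset, offset + src.length);
        // 2) v2 = blend(v1, src, m): take src where the mask is set, else keep v1.
        int[] v2 = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            v2[i] = m[i] ? src[i] : v1[i];
        }
        // 3) store(v2): write all lanes back unconditionally.
        System.arraycopy(v2, 0, dst, offset, v2.length);
    }

    public static void main(String[] args) {
        int[] mem = {10, 20, 30, 40, 50};
        maskedStore(mem, 1, new int[] {1, 2, 3}, new boolean[] {true, false, true});
        System.out.println(Arrays.toString(mem)); // [10, 1, 30, 3, 50]
    }
}
```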
On Wed, 6 Apr 2022 06:02:07 GMT, Vamsi Parasa wrote:
>> Optimizes the divideUnsigned() and remainderUnsigned() methods in
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This
>> change shows 3x improvement for Integer methods and up to 25% improvement for
>> Long. This
On Mon, 4 Apr 2022 07:24:12 GMT, Vamsi Parasa wrote:
>> Also need a jtreg test for this.
>
>> Also need a jtreg test for this.
>
> Thanks Sandhya for the review. Made the suggested changes and added jtreg
> tests as well.
Hi @vamsi-parasa , thanks for addressing my comments, looks good to me
On Sun, 17 Apr 2022 14:35:14 GMT, Jie Fu wrote:
>> According to the Vector API doc, the LSHR operator computes
>> a>>>(n&(ESIZE*8-1))
The documentation is correct if viewed strictly in the context of a subword vector lane.
The JVM internally promotes (sign-extends) subword-type scalar variables to int
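The difference between the two views can be shown with a small Java sketch for byte lanes (ESIZE = 1, so the lane-wise shift count is `n & 7`). The class and method names are hypothetical. `laneLshr` follows the documented lane formula quoted above, while `promotedLshr` is what ordinary scalar Java does: the byte is sign-extended to int before the shift and narrowed afterwards.

```java
// Sketch: lane-wise vs. promoted-to-int logical right shift on a byte.
// SubwordShiftSketch and its methods are hypothetical names.
final class SubwordShiftSketch {
    // Lane view: shift the 8-bit pattern itself, count masked to the lane width.
    static byte laneLshr(byte a, int n) {
        return (byte) ((a & 0xFF) >>> (n & 7));
    }

    // Scalar view: a is sign-extended to int, shifted, then truncated back to byte.
    static byte promotedLshr(byte a, int n) {
        return (byte) (a >>> n);
    }

    public static void main(String[] args) {
        byte a = (byte) -8; // bit pattern 0xF8
        System.out.println(laneLshr(a, 1));     // 124
        System.out.println(promotedLshr(a, 1)); // -4
    }
}
```

For negative subword values the two results differ, because the sign-extension bits shift into the low byte in the promoted form.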
over AARCH64 and X86 targets different AVX levels.
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request with a new target base due to a
merge or a rebase. The pull request now contains 13 commits:
- Merge branch
over AARCH64 and X86 targets different AVX levels.
>
> Kindly review and share your feedback.
>
> Best Regards,
> Jatin
Jatin Bhateja has updated the pull request incrementally with one additional
commit since the last revision:
8284960: Adding --enable-preview