Re: RFR: 8223347: Integration of Vector API (Incubator) [v4]

2020-10-13 Thread Sandhya Viswanathan
On Tue, 13 Oct 2020 21:29:52 GMT, Ekaterina Pavlova wrote: >> Build changes look good. > > Several gc tests crashed in panama-vector tier3 testing; these crashes do not seem to be > observed in the openjdk repo. > The crashes look like: > # assert(oopDesc::is_oop(obj)) failed: not an oop: 0xf

RFR: 8255174: Vector API unit tests for missed public api code coverage

2020-10-21 Thread Sandhya Viswanathan
Additional tests to increase Vector API public method code coverage to > 99%. - Commit messages: - 8255174: Vector API unit tests for missed public api code coverage Changes: https://git.openjdk.java.net/jdk/pull/785/files Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=785&r

Integrated: 8255174: Vector API unit tests for missed public api code coverage

2020-10-21 Thread Sandhya Viswanathan
On Wed, 21 Oct 2020 20:17:32 GMT, Sandhya Viswanathan wrote: > Additional tests to increase Vector API public method code coverage to > 99%. This pull request has now been integrated. Changeset: 5d262290 Author: Sandhya Viswanathan URL: https://git.openjdk.java.net/jdk/

RFR: 8256585: Remove in-place conversion vector operators from Vector API

2020-11-18 Thread Sandhya Viswanathan
Remove partially implemented in-place conversion vector operators from Vector API: ofNarrowing, ofWidening, INPLACE_XXX - Commit messages: - 8256585: Remove in-place conversion vector operators from Vector API Changes: https://git.openjdk.java.net/jdk/pull/1305/files Webrev: ht

Re: RFR: 8256585: Remove in-place conversion vector operators from Vector API [v2]

2020-11-19 Thread Sandhya Viswanathan
> Remove partially implemented in-place conversion vector operators from Vector > API: >ofNarrowing, ofWidening, INPLACE_XXX Sandhya Viswanathan has updated the pull request incrementally with one additional commit since the last revision: Update documentation -

Re: RFR: 8256585: Remove in-place conversion vector operators from Vector API [v2]

2020-11-19 Thread Sandhya Viswanathan
On Thu, 19 Nov 2020 17:18:12 GMT, Paul Sandoz wrote: >> Sandhya Viswanathan has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Update documentation > > The documentation `Vector.convert` and `Vector.convertSh

Integrated: 8256585: Remove in-place conversion vector operators from Vector API

2020-11-23 Thread Sandhya Viswanathan
On Thu, 19 Nov 2020 03:26:20 GMT, Sandhya Viswanathan wrote: > Remove partially implemented in-place conversion vector operators from Vector > API: >ofNarrowing, ofWidening, INPLACE_XXX This pull request has now been integrated. Changeset: 9de5d091 Author: Sandhya Viswana

[jdk16] RFR: 8259213: Vector conversion with part > 0 is not getting intrinsic implementation

2021-01-04 Thread Sandhya Viswanathan
Vector conversion with part > 0 is implemented using slice(origin, vector) instead of slice(origin). The slice(origin) method has an intrinsic implementation whereas slice(origin, vector) doesn’t. The slice(origin) method is written using Vector API methods like rearrange and blend which all have intrinsic implement
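For orientation, the lane semantics of the two slice variants named in this message can be sketched on plain int arrays. This is a scalar illustration only, not the patch under review: the class and method names (`SliceSketch`, `slice`) are hypothetical, and the arrays stand in for vectors of equal length.

```java
final class SliceSketch {
    // Two-input form: lanes [origin, n) of v1, followed by lanes [0, origin) of v2.
    static int[] slice(int origin, int[] v1, int[] v2) {
        int n = v1.length;
        if (v2.length != n || origin < 0 || origin > n) {
            throw new IllegalArgumentException("bad origin or mismatched lengths");
        }
        int[] result = new int[n];
        for (int i = 0; i < n; i++) {
            int j = i + origin;
            // Lanes that run past the end of v1 continue into v2.
            result[i] = j < n ? v1[j] : v2[j - n];
        }
        return result;
    }

    // Single-input form: behaves like the two-input form with an all-zero second input.
    static int[] slice(int origin, int[] v1) {
        return slice(origin, v1, new int[v1.length]);
    }
}
```

With inputs {1, 2, 3, 4} and {5, 6, 7, 8} and origin 1, the two-input form yields {2, 3, 4, 5}, while the single-input form yields {2, 3, 4, 0}.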

Re: [jdk16] RFR: 8259213: Vector conversion with part > 0 is not getting intrinsic implementation [v2]

2021-01-05 Thread Sandhya Viswanathan
> which all have intrinsic implementations. > Also, VectorIntrinsics.VECTOR_ACCESS_OOB_CHECK code is missing from rearrange > checkIndexes. > > Please review this patch which fixes the above issue. > > Best Regards, > Sandhya Sandhya Viswanathan has updated the pull request

Re: [jdk16] RFR: 8259213: Vector conversion with part > 0 is not getting intrinsic implementation [v2]

2021-01-05 Thread Sandhya Viswanathan
On Tue, 5 Jan 2021 15:51:08 GMT, Paul Sandoz wrote: >> Sandhya Viswanathan has updated the pull request incrementally with one >> additional commit since the last revision: >> >> update copyright year > > Looks good. Can you please update the copyright year bef

[jdk16] Integrated: 8259213: Vector conversion with part > 0 is not getting intrinsic implementation

2021-01-05 Thread Sandhya Viswanathan
On Tue, 5 Jan 2021 01:03:55 GMT, Sandhya Viswanathan wrote: > Vector conversion with part > 0 is implemented using slice(origin, vector) > instead of slice(origin). > The slice(origin) has intrinsic implementation whereas slice(origin, vector) > doesn’t. > Slice(origin

RFR: 8262989: Vectorize VectorShuffle checkIndexes, wrapIndexes and laneIsValid methods

2021-03-03 Thread Sandhya Viswanathan
The hot path of VectorShuffle checkIndexes, wrapIndexes and laneIsValid methods can be implemented using Vector API methods. For the attached jmh TestSlice.java, performance improves as below. Before: Benchmark (size) Mode Cnt Score Error Units T

Re: RFR: 8262989: Vectorize VectorShuffle checkIndexes, wrapIndexes and laneIsValid methods

2021-03-04 Thread Sandhya Viswanathan
On Thu, 4 Mar 2021 16:47:30 GMT, Paul Sandoz wrote: >> The hot path of VectorShuffle checkIndexes, wrapIndexes and laneIsValid >> methods can be implemented using Vector API methods. >> >> For the attached jmh TestSlice.java, performance improves as below. >> >> Before: >> Benchmark

Integrated: 8262989: Vectorize VectorShuffle checkIndexes, wrapIndexes and laneIsValid methods

2021-03-04 Thread Sandhya Viswanathan
On Wed, 3 Mar 2021 22:32:48 GMT, Sandhya Viswanathan wrote: > The hot path of VectorShuffle checkIndexes, wrapIndexes and laneIsValid > methods can be implemented using Vector API methods. > > For the attached jmh TestSlice.java, performance improves as below. > > Be

RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics

2021-04-22 Thread Sandhya Viswanathan
Intel Short Vector Math Library (SVML) based intrinsics in native x86 assembly provide optimized implementation for Vector API transcendental and trigonometric methods. These methods are built into a separate library instead of being part of libjvm.so or jvm.dll. The following changes are made:

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v2]

2021-04-28 Thread Sandhya Viswanathan
2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 > Float64Vector.EXP 65.80 486.37 ops/ms 7.39 > Float64Vector.EXPM1 34.61 85.99 ops/ms 2.48 > Fl

RFR: 8265128: [REDO] Optimize Vector API slice and unslice operations

2021-04-29 Thread Sandhya Viswanathan
All the slice and unslice variants that take more than one argument can benefit from already intrinsic methods on similar lines as slice(origin) and unslice(origin). Changes include: * Rewrite Vector API slice/unslice using already intrinsic methods * Fix in library_call.cpp:inline_preconditio

Re: RFR: 8265128: [REDO] Optimize Vector API slice and unslice operations

2021-04-29 Thread Sandhya Viswanathan
On Thu, 29 Apr 2021 22:59:13 GMT, Paul Sandoz wrote: >> All the slice and unslice variants that take more than one argument can >> benefit from already intrinsic methods on similar lines as slice(origin) and >> unslice(origin). >> >> Changes include: >> * Rewrite Vector API slice/unslice usin

Re: RFR: 8265128: [REDO] Optimize Vector API slice and unslice operations

2021-04-29 Thread Sandhya Viswanathan
On Thu, 29 Apr 2021 21:29:03 GMT, Sandhya Viswanathan wrote: > All the slice and unslice variants that take more than one argument can > benefit from already intrinsic methods on similar lines as slice(origin) and > unslice(origin). > > Changes include: > * Rewrite Vector

Re: RFR: 8265128: [REDO] Optimize Vector API slice and unslice operations [v2]

2021-04-29 Thread Sandhya Viswanathan
TestSlice.vectorSliceUnsliceOrigin 1024 thrpt 5 6662.159 ± 8.203 ops/ms > TestSlice.vectorSliceUnsliceOriginVector 1024 thrpt 5 5206.300 ± 43.637 ops/ms > TestSlice.vectorSliceUnsliceOriginVectorPart 1024 thrpt 5 5194.278 ± 13.376 > ops/ms Sandhya Viswanathan has updated the pull request incr

Re: RFR: 8265128: [REDO] Optimize Vector API slice and unslice operations [v2]

2021-04-29 Thread Sandhya Viswanathan
On Fri, 30 Apr 2021 00:17:24 GMT, Sandhya Viswanathan wrote: >> All the slice and unslice variants that take more than one argument can >> benefit from already intrinsic methods on similar lines as slice(origin) and >> unslice(origin). >> >> Changes include:

Re: RFR: 8265128: [REDO] Optimize Vector API slice and unslice operations [v3]

2021-04-29 Thread Sandhya Viswanathan
TestSlice.vectorSliceUnsliceOrigin 1024 thrpt 5 6662.159 ± 8.203 ops/ms > TestSlice.vectorSliceUnsliceOriginVector 1024 thrpt 5 5206.300 ± 43.637 ops/ms > TestSlice.vectorSliceUnsliceOriginVectorPart 1024 thrpt 5 5194.278 ± 13.376 > ops/ms Sandhya Viswanathan has updated the pull request incr

Re: RFR: 8265128: [REDO] Optimize Vector API slice and unslice operations [v3]

2021-04-30 Thread Sandhya Viswanathan
On Fri, 30 Apr 2021 02:31:07 GMT, Paul Sandoz wrote: >> Sandhya Viswanathan has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Review comments: blendmask etc > > Marked as reviewed by psandoz (Reviewer). @P

Re: RFR: 8265128: [REDO] Optimize Vector API slice and unslice operations [v3]

2021-04-30 Thread Sandhya Viswanathan
On Fri, 30 Apr 2021 23:34:15 GMT, Paul Sandoz wrote: > > > @PaulSandoz would it be possible for you to run this through your testing? > > > > > > Started, will report back when done. > > Tier 1 to 3 tests all pass on build profiles linux-x64 linux-aarch64 > macosx-x64 windows-x64 @PaulSandoz

Re: RFR: 8265128: [REDO] Optimize Vector API slice and unslice operations [v3]

2021-05-03 Thread Sandhya Viswanathan
On Fri, 30 Apr 2021 01:58:27 GMT, Sandhya Viswanathan wrote: >> All the slice and unslice variants that take more than one argument can >> benefit from already intrinsic methods on similar lines as slice(origin) and >> unslice(origin). >> >> Changes include:

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v2]

2021-05-03 Thread Sandhya Viswanathan
On Mon, 3 May 2021 21:41:26 GMT, Paul Sandoz wrote: >> Sandhya Viswanathan has updated the pull request with a new target base due >> to a merge or a rebase. The pull request now contains six commits: >> >> - Merge master >> - remove whitespace >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v2]

2021-05-04 Thread Sandhya Viswanathan
On Wed, 28 Apr 2021 21:11:26 GMT, Sandhya Viswanathan wrote: >> This PR contains Short Vector Math Library support related changes for >> [JEP-414 Vector API (Second Incubator)](https://openjdk.java.net/jeps/414), >> in preparation for when targeted. >> >> I

Re: RFR: 8265128: [REDO] Optimize Vector API slice and unslice operations [v4]

2021-05-10 Thread Sandhya Viswanathan
TestSlice.vectorSliceUnsliceOrigin 1024 thrpt 5 6662.159 ± 8.203 ops/ms > TestSlice.vectorSliceUnsliceOriginVector 1024 thrpt 5 5206.300 ± 43.637 ops/ms > TestSlice.vectorSliceUnsliceOriginVectorPart 1024 thrpt 5 5194.278 ± 13.376 > ops/ms Sandhya Viswanathan has updated the pull request incre

Re: RFR: 8265128: [REDO] Optimize Vector API slice and unslice operations [v3]

2021-05-10 Thread Sandhya Viswanathan
On Fri, 30 Apr 2021 23:34:15 GMT, Paul Sandoz wrote: >>> @PaulSandoz would it be possible for you to run this through your testing? >> >> Started, will report back when done. > >> > @PaulSandoz would it be possible for you to run this through your testing? >> >> Started, will report back when d

Re: RFR: 8265128: [REDO] Optimize Vector API slice and unslice operations [v4]

2021-05-10 Thread Sandhya Viswanathan
On Mon, 10 May 2021 18:31:30 GMT, Sandhya Viswanathan wrote: >> All the slice and unslice variants that take more than one argument can >> benefit from already intrinsic methods on similar lines as slice(origin) and >> unslice(origin). >> >> Changes include:

Integrated: 8265128: [REDO] Optimize Vector API slice and unslice operations

2021-05-10 Thread Sandhya Viswanathan
On Thu, 29 Apr 2021 21:29:03 GMT, Sandhya Viswanathan wrote: > All the slice and unslice variants that take more than one argument can > benefit from already intrinsic methods on similar lines as slice(origin) and > unslice(origin). > > Changes include: > * Rewrite Vector

RFR: 8267190: Optimize Vector API test operations

2021-05-14 Thread Sandhya Viswanathan
Vector API test operations (IS_DEFAULT, IS_FINITE, IS_INFINITE, IS_NAN and IS_NEGATIVE) are computed in three steps: 1) reinterpret the floating point vectors as integral vectors (int/long) 2) perform the test in the integer domain to get an int/long mask 3) reinterpret the int/long mask as float/do
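The idea of performing a floating point test in the integer domain can be illustrated per lane with scalar code. The sketch below is an assumption-laden illustration, not the code from the pull request: the class name `FloatBitTests` and its constants are hypothetical, and only three of the listed tests are shown for `float`.

```java
final class FloatBitTests {
    // Clears the sign bit so that only exponent and mantissa remain.
    private static final int ABS_MASK = 0x7fffffff;
    // All exponent bits set, mantissa zero: the bit pattern of +Infinity.
    private static final int EXP_MASK = 0x7f800000;

    // NaN: exponent all ones and a non-zero mantissa.
    static boolean isNaN(float f) {
        return (Float.floatToRawIntBits(f) & ABS_MASK) > EXP_MASK;
    }

    // Infinite: exponent all ones and a zero mantissa, either sign.
    static boolean isInfinite(float f) {
        return (Float.floatToRawIntBits(f) & ABS_MASK) == EXP_MASK;
    }

    // Finite: exponent not all ones.
    static boolean isFinite(float f) {
        return (Float.floatToRawIntBits(f) & ABS_MASK) < EXP_MASK;
    }
}
```

Each method reinterprets the float as an int and then uses a plain integer comparison, which mirrors steps 1 and 2 of the description for a single lane.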

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v3]

2021-05-14 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v4]

2021-05-14 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v5]

2021-05-18 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v6]

2021-05-18 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v6]

2021-05-18 Thread Sandhya Viswanathan
On Tue, 18 May 2021 23:43:13 GMT, Sandhya Viswanathan wrote: >> This PR contains Short Vector Math Library support related changes for >> [JEP-414 Vector API (Second Incubator)](https://openjdk.java.net/jeps/414), >> in preparation for when targeted. >> >> I

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v7]

2021-05-18 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v8]

2021-05-18 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v7]

2021-05-18 Thread Sandhya Viswanathan
On Wed, 19 May 2021 00:26:48 GMT, Vladimir Kozlov wrote: >> Sandhya Viswanathan has updated the pull request incrementally with one >> additional commit since the last revision: >> >> jcheck fixes > > This is much much better! Thank you for changing it. I am onl

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v8]

2021-05-18 Thread Sandhya Viswanathan
On Wed, 19 May 2021 00:58:15 GMT, Sandhya Viswanathan wrote: >> This PR contains Short Vector Math Library support related changes for >> [JEP-414 Vector API (Second Incubator)](https://openjdk.java.net/jeps/414), >> in preparation for when targeted. >> >> I

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v9]

2021-05-18 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v10]

2021-05-19 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v2]

2021-05-19 Thread Sandhya Viswanathan
On Mon, 3 May 2021 21:41:26 GMT, Paul Sandoz wrote: >> Sandhya Viswanathan has updated the pull request with a new target base due >> to a merge or a rebase. The pull request now contains six commits: >> >> - Merge master >> - remove whitespace >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v11]

2021-05-19 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v2]

2021-05-19 Thread Sandhya Viswanathan
On Wed, 19 May 2021 22:02:14 GMT, Paul Sandoz wrote: >> Tier 1 to 3 tests pass for the default set of build profiles. > >> Thanks a lot for the review @PaulSandoz @iwanowww @erikj79. >> Paul and Vladimir, I have implemented your review comments. Please take a >> look. > > `case VECTOR_OP_OR` is

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v12]

2021-05-19 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v13]

2021-05-19 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8267190: Optimize Vector API test operations

2021-05-19 Thread Sandhya Viswanathan
On Wed, 19 May 2021 16:51:33 GMT, Paul Sandoz wrote: >> Vector API test operations (IS_DEFAULT, IS_FINITE, IS_INFINITE, IS_NAN and >> IS_NEGATIVE) are computed in three steps: >> 1) reinterpreting the floating point vectors as integral vectors (int/long) >> 2) perform the test in integer domain

Re: RFR: 8267190: Optimize Vector API test operations [v2]

2021-05-19 Thread Sandhya Viswanathan
s > VectorTestPerf.IS_INFINITE 1024 thrpt 5 8932.730 ± 269.988 ops/ms > VectorTestPerf.IS_NAN 1024 thrpt 5 8574.872 ± 498.649 ops/ms > VectorTestPerf.IS_NEGATIVE 1024 thrpt 5 8838.400 ± 11.849 ops/ms > > Best Regards, > Sandhya Sandhya Viswanathan has updated the pull request

Re: RFR: 8267190: Optimize Vector API test operations [v3]

2021-05-20 Thread Sandhya Viswanathan
s > VectorTestPerf.IS_INFINITE 1024 thrpt 5 8932.730 ± 269.988 ops/ms > VectorTestPerf.IS_NAN 1024 thrpt 5 8574.872 ± 498.649 ops/ms > VectorTestPerf.IS_NEGATIVE 1024 thrpt 5 8838.400 ± 11.849 ops/ms > > Best Regards, > Sandhya Sandhya Viswanathan has updated the pull request

Re: RFR: 8267190: Optimize Vector API test operations [v3]

2021-05-20 Thread Sandhya Viswanathan
On Thu, 20 May 2021 23:19:01 GMT, Sandhya Viswanathan wrote: >> Vector API test operations (IS_DEFAULT, IS_FINITE, IS_INFINITE, IS_NAN and >> IS_NEGATIVE) are computed in three steps: >> 1) reinterpreting the floating point vectors as integral vectors (int/long) >>

Integrated: 8267190: Optimize Vector API test operations

2021-05-21 Thread Sandhya Viswanathan
On Fri, 14 May 2021 23:58:38 GMT, Sandhya Viswanathan wrote: > Vector API test operations (IS_DEFAULT, IS_FINITE, IS_INFINITE, IS_NAN and > IS_NEGATIVE) are computed in three steps: > 1) reinterpreting the floating point vectors as integral vectors (int/long) > 2) perform the tes

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v14]

2021-05-25 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v15]

2021-05-25 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v16]

2021-06-02 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

RFR: 8268151: Vector API toShuffle optimization

2021-06-02 Thread Sandhya Viswanathan
The Vector API toShuffle method can be optimized using existing vector conversion intrinsic. The following changes are made: 1) vector.toShuffle java implementation is changed to call VectorSupport.convert. 2) The conversion intrinsic (inline_vector_convert()) in vectorIntrinsics.cpp is change

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v17]

2021-06-03 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8268151: Vector API toShuffle optimization

2021-06-03 Thread Sandhya Viswanathan
On Thu, 3 Jun 2021 02:27:35 GMT, Xiaohong Gong wrote: >> The Vector API toShuffle method can be optimized using existing vector >> conversion intrinsic. >> >> The following changes are made: >> 1) vector.toShuffle java implementation is changed to call >> VectorSupport.convert. >> 2) The conv

Integrated: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics

2021-06-03 Thread Sandhya Viswanathan
On Thu, 22 Apr 2021 19:07:28 GMT, Sandhya Viswanathan wrote: > This PR contains Short Vector Math Library support related changes for > [JEP-414 Vector API (Second Incubator)](https://openjdk.java.net/jeps/414), > in preparation for when targeted. > > Intel Short Vector Mat

Re: RFR: 8268151: Vector API toShuffle optimization [v2]

2021-06-03 Thread Sandhya Viswanathan
unnecessary boxing > during back to back vector.toShuffle and shuffle.toVector calls. > > Best Regards, > Sandhya Sandhya Viswanathan has updated the pull request incrementally with one additional commit since the last revision: Implement review comments - Chang

Re: RFR: 8268151: Vector API toShuffle optimization [v2]

2021-06-03 Thread Sandhya Viswanathan
On Thu, 3 Jun 2021 02:31:51 GMT, Xiaohong Gong wrote: >> Sandhya Viswanathan has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Implement review comments > > src/jdk.incubator.vector/share/classes/jdk/incub

Re: RFR: 8268151: Vector API toShuffle optimization [v2]

2021-06-03 Thread Sandhya Viswanathan
On Thu, 3 Jun 2021 22:01:12 GMT, Paul Sandoz wrote: >> Sandhya Viswanathan has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Implement review comments > > Java changes look good. @PaulSand

Re: RFR: 8268151: Vector API toShuffle optimization [v2]

2021-06-04 Thread Sandhya Viswanathan
On Fri, 4 Jun 2021 13:03:24 GMT, Vladimir Ivanov wrote: >> Sandhya Viswanathan has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Implement review comments > > Looks good. > > One inefficiency

Integrated: 8268151: Vector API toShuffle optimization

2021-06-04 Thread Sandhya Viswanathan
On Thu, 3 Jun 2021 00:29:00 GMT, Sandhya Viswanathan wrote: > The Vector API toShuffle method can be optimized using existing vector > conversion intrinsic. > > The following changes are made: > 1) vector.toShuffle java implementation is changed to call > VectorSupport

Re: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v3]

2021-06-08 Thread Sandhya Viswanathan
On Tue, 8 Jun 2021 00:30:38 GMT, Scott Gibbons wrote: >> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. >> Also allows for performance improvement for non-AVX-512 enabled platforms. >> Due to the nature of MIME-encoded inputs, modify the intrinsic signature to >>

Re: [jdk17] RFR: 8268353: Test libsvml.so is and is not present in jdk image

2021-06-15 Thread Sandhya Viswanathan
On Mon, 14 Jun 2021 16:06:04 GMT, Paul Sandoz wrote: > Test that when the jdk.incubator.vector module is present that libsvml.so is > present, and test the opposite case. Looks good to me. - Marked as reviewed by sviswanathan (Reviewer). PR: https://git.openjdk.java.net/jdk17/pul

Re: [jdk17] RFR: 8266518: Refactor and expand scatter/gather tests

2021-06-16 Thread Sandhya Viswanathan
On Mon, 14 Jun 2021 16:26:17 GMT, Paul Sandoz wrote: > Refactor scatter/gather tests to be included in the load/store test classes > and expand to support access between `ShortVector` and `char[]`, and > access between `ByteVector` and `boolean[]`. > > Vector tests pass on linux-x64 linux-

Re: [jdk17] RFR: 8266518: Refactor and expand scatter/gather tests [v2]

2021-06-17 Thread Sandhya Viswanathan
On Thu, 17 Jun 2021 15:09:17 GMT, Paul Sandoz wrote: >> Refactor scatter/gather tests to be included in the load/store test classes >> and expand to support access between `ShortVector` and `char[]`, and >> access between `ByteVector` and `boolean[]`. >> >> Vector tests pass on linux-x64 l

Re: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v5]

2021-06-18 Thread Sandhya Viswanathan
On Fri, 18 Jun 2021 22:12:11 GMT, Scott Gibbons wrote: >> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. >> Also allows for performance improvement for non-AVX-512 enabled platforms. >> Due to the nature of MIME-encoded inputs, modify the intrinsic signature to >

Re: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v5]

2021-06-19 Thread Sandhya Viswanathan
On Fri, 18 Jun 2021 22:12:11 GMT, Scott Gibbons wrote: >> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. >> Also allows for performance improvement for non-AVX-512 enabled platforms. >> Due to the nature of MIME-encoded inputs, modify the intrinsic signature to >

Re: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v6]

2021-06-22 Thread Sandhya Viswanathan
On Tue, 22 Jun 2021 20:47:55 GMT, Scott Gibbons wrote: >> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. >> Also allows for performance improvement for non-AVX-512 enabled platforms. >> Due to the nature of MIME-encoded inputs, modify the intrinsic signature to >

Re: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v6]

2021-06-22 Thread Sandhya Viswanathan
On Tue, 22 Jun 2021 20:47:55 GMT, Scott Gibbons wrote: >> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. >> Also allows for performance improvement for non-AVX-512 enabled platforms. >> Due to the nature of MIME-encoded inputs, modify the intrinsic signature to >

Re: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v7]

2021-06-24 Thread Sandhya Viswanathan
On Thu, 24 Jun 2021 14:50:01 GMT, Vladimir Kozlov wrote: >> Scott Gibbons has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Fixing Windows build warnings > > The rest of testing hs-tier1-4 and xcomp is finished and clean. > So this is the

Re: RFR: 8266054: VectorAPI rotate operation optimization [v10]

2021-07-15 Thread Sandhya Viswanathan
On Thu, 15 Jul 2021 08:34:42 GMT, Jatin Bhateja wrote: >> The current VectorAPI Java-side implementation expresses the rotateLeft and >> rotateRight operations using the following operations: >> >> vec1 = lanewise(VectorOperators.LSHL, n) >> vec2 = lanewise(VectorOperators.LSHR, n) >> res = lan
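The general technique of building a rotate out of two shifts can be shown per lane in scalar Java. This is a sketch under the assumption that the two shifted values are combined with a bitwise OR and that the second shift uses the complementary count; the quoted snippet is cut off before it shows either detail, and the class name `RotateSketch` is hypothetical.

```java
final class RotateSketch {
    // Rotate left: bits shifted out on the left re-enter on the right.
    static int rotateLeft(int x, int n) {
        int s = n & 31;                       // wrap the count into [0, 31]
        int high = x << s;                    // logical shift left
        int low = x >>> ((32 - s) & 31);      // logical shift right by the complement
        return high | low;
    }

    // Rotate right: the mirror image of rotateLeft.
    static int rotateRight(int x, int n) {
        int s = n & 31;
        return (x >>> s) | (x << ((32 - s) & 31));
    }
}
```

Both methods agree with `Integer.rotateLeft` and `Integer.rotateRight` for any count, including negative counts and multiples of 32.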

Re: RFR: 8266054: VectorAPI rotate operation optimization [v10]

2021-07-26 Thread Sandhya Viswanathan
On Sun, 18 Jul 2021 20:22:18 GMT, Jatin Bhateja wrote: >> src/hotspot/share/opto/vectornode.cpp line 1180: >> >>> 1178: cnt = cnt->in(1); >>> 1179: } >>> 1180: shiftRCnt = cnt; >> >> Why do we remove the And with mask here? > > And'ing with shift_mask is already done on Java API s

Re: RFR: 8266054: VectorAPI rotate operation optimization [v13]

2021-07-26 Thread Sandhya Viswanathan
On Tue, 20 Jul 2021 09:57:07 GMT, Jatin Bhateja wrote: >> The current VectorAPI Java-side implementation expresses the rotateLeft and >> rotateRight operations using the following operations: >> >> vec1 = lanewise(VectorOperators.LSHL, n) >> vec2 = lanewise(VectorOperators.LSHR, n) >> res = lan

Re: RFR: 8266054: VectorAPI rotate operation optimization [v13]

2021-07-27 Thread Sandhya Viswanathan
On Tue, 27 Jul 2021 08:17:55 GMT, Jatin Bhateja wrote: >> src/hotspot/share/opto/vectorIntrinsics.cpp line 1598: >> >>> 1596: cnt = elem_bt == T_LONG ? gvn().transform(new ConvI2LNode(cnt)) >>> : cnt; >>> 1597: opd2 = gvn().transform(VectorNode::scalar2vector(cnt, num_elem, >>> typ

Re: RFR: 8266054: VectorAPI rotate operation optimization [v13]

2021-07-27 Thread Sandhya Viswanathan
On Tue, 20 Jul 2021 09:57:07 GMT, Jatin Bhateja wrote: >> The current VectorAPI Java-side implementation expresses the rotateLeft and >> rotateRight operations using the following operations: >> >> vec1 = lanewise(VectorOperators.LSHL, n) >> vec2 = lanewise(VectorOperators.LSHR, n) >> res = lan

Re: RFR: 8266054: VectorAPI rotate operation optimization [v13]

2021-07-27 Thread Sandhya Viswanathan
On Tue, 27 Jul 2021 18:05:49 GMT, Sandhya Viswanathan wrote: >> Correcting this, I2L may be needed in auto-vectorization flow since >> Integer/Long.rotate[Right/Left] APIs accept only integral shift, so for >> Long.rotate* operations integral shift value must be conve

Re: RFR: 8266054: VectorAPI rotate operation optimization [v13]

2021-07-28 Thread Sandhya Viswanathan
On Wed, 28 Jul 2021 04:48:35 GMT, Vladimir Kozlov wrote: >> Looks good to me. > > @sviswa7 and @jatin-bhateja > The push caused https://bugs.openjdk.java.net/browse/JDK-8271366 > I strongly suggest that in the future you ask an Oracle engineer to test Intel's > changes before pushing.

RFR: 8272861: Add a micro benchmark for vector api

2021-08-23 Thread Sandhya Viswanathan
This pull request adds a micro benchmark for Vector API. The Black Scholes algorithm is implemented with and without Vector API. We see about a 6x gain with Vector API for this micro benchmark using 256-bit vectors. - Commit messages: - whitespace - 8272861: Add a micro benchmark fo

Re: RFR: 8272861: Add a micro benchmark for vector api [v2]

2021-08-24 Thread Sandhya Viswanathan
> This pull request adds a micro benchmark for Vector API. > The Black Scholes algorithm is implemented with and without Vector API. > We see about ~6x gain with Vector API for this micro benchmark using 256 bit > vectors. Sandhya Viswanathan has updated the pull request incrementa

Re: RFR: 8272861: Add a micro benchmark for vector api [v3]

2021-08-24 Thread Sandhya Viswanathan
> This pull request adds a micro benchmark for Vector API. > The Black Scholes algorithm is implemented with and without Vector API. > We see about ~6x gain with Vector API for this micro benchmark using 256 bit > vectors. Sandhya Viswanathan has updated the pull request incrementa

Re: RFR: 8272861: Add a micro benchmark for vector api [v3]

2021-08-24 Thread Sandhya Viswanathan
On Tue, 24 Aug 2021 10:09:13 GMT, Aleksey Shipilev wrote: >> Sandhya Viswanathan has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Make constants as static final > > Some benchmark comments. @shipile

Re: RFR: 8272861: Add a micro benchmark for vector api [v3]

2021-08-25 Thread Sandhya Viswanathan
On Tue, 24 Aug 2021 20:49:52 GMT, Sandhya Viswanathan wrote: >> This pull request adds a micro benchmark for Vector API. >> The Black Scholes algorithm is implemented with and without Vector API. >> We see about ~6x gain with Vector API for this micro benchmark using 2

Re: RFR: 8272861: Add a micro benchmark for vector api [v4]

2021-08-26 Thread Sandhya Viswanathan
> This pull request adds a micro benchmark for Vector API. > The Black Scholes algorithm is implemented with and without Vector API. > We see about ~6x gain with Vector API for this micro benchmark using 256 bit > vectors. Sandhya Viswanathan has updated the pull request incrementa

Integrated: 8272861: Add a micro benchmark for vector api

2021-08-30 Thread Sandhya Viswanathan
On Mon, 23 Aug 2021 23:18:28 GMT, Sandhya Viswanathan wrote: > This pull request adds a micro benchmark for Vector API. > The Black Scholes algorithm is implemented with and without Vector API. > We see about ~6x gain with Vector API for this micro benchmark using 256 bit > vectors

RFR: 8273450: Fix the copyright header of SVML files

2021-09-07 Thread Sandhya Viswanathan
Fix the copyright header of SVML files to match others. This was brought up on jdk-dev mailing list: https://mail.openjdk.java.net/pipermail/jdk-dev/2021-September/005992.html - Commit messages: - 8273450: Fix the copyright header of SVML file Changes: https://git.openjdk.java.net/

Re: RFR: 8273450: Fix the copyright header of SVML files

2021-09-07 Thread Sandhya Viswanathan
On Tue, 7 Sep 2021 23:08:08 GMT, David Holmes wrote: >> Fix the copyright header of SVML files to match others. >> >> This was brought up on jdk-dev mailing list: >> https://mail.openjdk.java.net/pipermail/jdk-dev/2021-September/005992.html > > Hi Sandhya, > > You must not change another compan

Re: RFR: 8273450: Fix the copyright header of SVML files

2021-09-07 Thread Sandhya Viswanathan
On Tue, 7 Sep 2021 23:39:54 GMT, David Holmes wrote: >> @dholmes-ora I am from Intel so editing the Intel copyright line should be >> ok? > > @sviswa7 My apologies, I hadn't realized you worked for Intel. But note that > other Intel files i.e. ./hotspot/cpu/x86/macroAssembler_x86_*.cpp also do

Re: RFR: 8273450: Fix the copyright header of SVML files

2021-09-08 Thread Sandhya Viswanathan
On Wed, 8 Sep 2021 02:03:12 GMT, Paul Sandoz wrote: >> Fix the copyright header of SVML files to match others. >> >> This was brought up on jdk-dev mailing list: >> https://mail.openjdk.java.net/pipermail/jdk-dev/2021-September/005992.html > > Marked as reviewed by psandoz (Reviewer). Thanks a

Integrated: 8273450: Fix the copyright header of SVML files

2021-09-08 Thread Sandhya Viswanathan
On Tue, 7 Sep 2021 20:25:25 GMT, Sandhya Viswanathan wrote: > Fix the copyright header of SVML files to match others. > > This was brought up on jdk-dev mailing list: > https://mail.openjdk.java.net/pipermail/jdk-dev/2021-September/005992.html This pull request has now bee

Re: RFR: 8274242: Implement fast-path for ASCII-compatible CharsetEncoders on x86

2021-09-24 Thread Sandhya Viswanathan
On Tue, 21 Sep 2021 21:58:48 GMT, Claes Redestad wrote: > This patch extends the `ISO_8859_1.implEncodeISOArray` intrinsic on x86 to > work also for ASCII encoding, which makes for example the `UTF_8$Encoder` > perform on par with (or outperform) similarly getting charset encoded bytes > from

Re: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3]

2021-10-18 Thread Sandhya Viswanathan
On Sat, 16 Oct 2021 00:56:14 GMT, Paul Sandoz wrote: >> This PR improves the performance of vector operations that accept masks on >> architectures that support masking in hardware, specifically Intel AVX512 >> and ARM SVE. >> >> On architectures that do not support masking in hardware the sam

Re: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3]

2021-10-19 Thread Sandhya Viswanathan
On Tue, 19 Oct 2021 18:54:01 GMT, Paul Sandoz wrote: >> src/hotspot/share/utilities/globalDefinitions_vecApi.hpp line 29: >> >>> 27: // the intent of this file to provide a header that can be included in >>> .s files. >>> 28: >>> 29: #ifndef SHARE_VM_UTILITIES_GLOBALDEFINITIONS_VECAPI_HPP >>

Re: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3]

2021-10-19 Thread Sandhya Viswanathan
On Tue, 19 Oct 2021 19:51:54 GMT, Paul Sandoz wrote: >> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java >> line 603: >> >>> 601: if (opKind(op, VO_SPECIAL)) { >>> 602: if (op == ZOMO) { >>> 603: return blend(broadcast(-1), compare(

Re: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v4]

2021-10-19 Thread Sandhya Viswanathan
On Tue, 19 Oct 2021 22:37:10 GMT, Paul Sandoz wrote: >> This PR improves the performance of vector operations that accept masks on >> architectures that support masking in hardware, specifically Intel AVX512 >> and ARM SVE. >> >> On architectures that do not support masking in hardware the sam

Re: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3]

2021-10-19 Thread Sandhya Viswanathan
On Sat, 16 Oct 2021 00:56:14 GMT, Paul Sandoz wrote: >> This PR improves the performance of vector operations that accept masks on >> architectures that support masking in hardware, specifically Intel AVX512 >> and ARM SVE. >> >> On architectures that do not support masking in hardware the sam

Re: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3]

2021-10-19 Thread Sandhya Viswanathan
On Tue, 19 Oct 2021 22:34:13 GMT, Paul Sandoz wrote: >> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorMask.java >> line 574: >> >>> 572: * @throws ClassCastException if the species is wrong >>> 573: */ >>> 574: abstract VectorMask check(Class> >>> maskClass,
