Re: RFR: 8283726: x86_64 intrinsics for compareUnsigned method in Integer and Long

2022-06-09 Thread Sandhya Viswanathan
On Wed, 8 Jun 2022 09:39:04 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch implements intrinsics for `Integer/Long::compareUnsigned` using >> the same approach as the JVM does for long and floating-point comparisons. >> This allows efficient and reliable usage of unsigned comparison in
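For readers unfamiliar with the method named in this thread, the following is a minimal plain-Java sketch of what `Integer.compareUnsigned` computes. It is not the intrinsic from the patch; the class name `UnsignedCompareSketch` and the helper `compareUnsignedSketch` are hypothetical and exist only for illustration.

```java
public class UnsignedCompareSketch {
    // Hypothetical helper: flipping the sign bit maps unsigned order onto
    // signed order, so an ordinary signed compare gives the unsigned result.
    static int compareUnsignedSketch(int x, int y) {
        return Integer.compare(x ^ Integer.MIN_VALUE, y ^ Integer.MIN_VALUE);
    }

    public static void main(String[] args) {
        // -1 is 0xFFFFFFFF, the largest unsigned 32-bit value.
        System.out.println(compareUnsignedSketch(-1, 1) > 0);
        // The library method agrees on the ordering.
        System.out.println(Integer.compareUnsigned(-1, 1) > 0);
    }
}
```

Both lines print `true`: as an unsigned value, `-1` is larger than `1`.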

Re: RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature [v3]

2022-06-02 Thread Sandhya Viswanathan
On Thu, 2 Jun 2022 03:24:07 GMT, Xiaohong Gong wrote: >>> @XiaohongGong Could you please rebase the branch and resolve conflicts? >> >> Sure, I'm working on this now. The patch will be updated soon. Thanks. > >> > @XiaohongGong Could you please rebase the branch and resolve conflicts? >> >>

Re: RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature [v5]

2022-06-02 Thread Sandhya Viswanathan
On Thu, 2 Jun 2022 03:27:59 GMT, Xiaohong Gong wrote: >> Currently the vector load with mask when the given index happens out of the >> array boundary is implemented with pure Java scalar code to avoid the IOOBE >> (IndexOutOfBoundsException). This is necessary for architectures that do >>

Re: RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature [v3]

2022-06-01 Thread Sandhya Viswanathan
On Fri, 13 May 2022 08:58:12 GMT, Xiaohong Gong wrote: >> Yes, the tests were run in debug mode. The reporting of the missing constant >> occurs for the compiled method that is called from the method where the >> constants are declared e.g.: >> >> 719 240b

Re: RFR: 8285973: x86_64: Improve fp comparison and cmove for eq/ne [v3]

2022-05-23 Thread Sandhya Viswanathan
On Sat, 21 May 2022 10:31:25 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch optimises the matching rules for floating-point comparison with >> respect to eq/ne on x86-64 >> >> 1, When the inputs of a comparison are the same (i.e. `isNaN` patterns), `ZF` >> is always set, so we don't need

Re: RFR: 8285973: x86_64: Improve fp comparison and cmove for eq/ne [v2]

2022-05-20 Thread Sandhya Viswanathan
On Wed, 4 May 2022 23:16:41 GMT, Vladimir Kozlov wrote: >> src/hotspot/cpu/x86/x86_64.ad line 6998: >> >>> 6996: ins_encode %{ >>> 6997: __ cmovl(Assembler::parity, $dst$$Register, $src$$Register); >>> 6998: __ cmovl(Assembler::notEqual, $dst$$Register, $src$$Register); >> >> Should

Re: RFR: 8285973: x86_64: Improve fp comparison and cmove for eq/ne

2022-05-20 Thread Sandhya Viswanathan
On Wed, 18 May 2022 14:59:33 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch optimises the matching rules for floating-point comparison with >> respect to eq/ne on x86-64 >> >> 1, When the inputs of a comparison are the same (i.e. `isNaN` patterns), `ZF` >> is always set, so we don't need

Re: RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature [v3]

2022-05-05 Thread Sandhya Viswanathan
On Fri, 6 May 2022 03:47:47 GMT, Xiaohong Gong wrote: >> src/hotspot/share/opto/vectorIntrinsics.cpp line 1238: >> >>> 1236: } else { >>> 1237: // Masked vector load with IOOBE always uses the predicated >>> load. >>> 1238: const TypeInt* offset_in_range = >>>

Re: RFR: 8286029: Add classpath exemption to globals_vectorApiSupport_***.S.inc

2022-05-04 Thread Sandhya Viswanathan
On Mon, 2 May 2022 20:05:36 GMT, Tyler Steele wrote: > Adds missing classpath exception to the header of two GPLv2 files. > > Requested > [here](https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2022-April/013988.html).

Re: RFR: 8286029: Add classpath exemption to globals_vectorApiSupport_***.S.inc

2022-05-04 Thread Sandhya Viswanathan
On Mon, 2 May 2022 20:05:36 GMT, Tyler Steele wrote: > Adds missing classpath exception to the header of two GPLv2 files. > > Requested > [here](https://mail.openjdk.java.net/pipermail/jdk-updates-dev/2022-April/013988.html). Marked as reviewed by sviswanathan (Reviewer). - PR:

Re: RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature [v2]

2022-04-28 Thread Sandhya Viswanathan
On Fri, 22 Apr 2022 07:08:24 GMT, Xiaohong Gong wrote: >> Currently the vector load with mask when the given index happens out of the >> array boundary is implemented with pure Java scalar code to avoid the IOOBE >> (IndexOutOfBoundsException). This is necessary for architectures that do >>

Re: RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature [v2]

2022-04-27 Thread Sandhya Viswanathan
On Fri, 22 Apr 2022 07:08:24 GMT, Xiaohong Gong wrote: >> Currently the vector load with mask when the given index happens out of the >> array boundary is implemented with pure Java scalar code to avoid the IOOBE >> (IndexOutOfBoundsException). This is necessary for architectures that do >>

Re: RFR: 8283667: [vectorapi] Vectorization for masked load with IOOBE with predicate feature

2022-04-08 Thread Sandhya Viswanathan
On Wed, 30 Mar 2022 10:31:59 GMT, Xiaohong Gong wrote: > Currently the vector load with mask when the given index happens out of the > array boundary is implemented with pure Java scalar code to avoid the IOOBE > (IndexOutOfBoundsException). This is necessary for architectures that do > not

Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v12]

2022-04-08 Thread Sandhya Viswanathan
On Fri, 8 Apr 2022 01:05:33 GMT, Srinivas Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and up to 25% improvement for >>

Re: RFR: 8282221: x86 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v8]

2022-04-05 Thread Sandhya Viswanathan
On Tue, 5 Apr 2022 20:26:18 GMT, Vamsi Parasa wrote: >> Optimizes the divideUnsigned() and remainderUnsigned() methods in >> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This >> change shows 3x improvement for Integer methods and up to 25% improvement for >> Long. This

Re: RFR: 8279508: Auto-vectorize Math.round API [v2]

2022-03-11 Thread Sandhya Viswanathan
On Thu, 3 Mar 2022 05:42:23 GMT, Jatin Bhateja wrote: >> The testing for this PR doesn't look adequate to me. I don't see any testing >> for the values where the behavior of round has been redefined at points in >> the last decade. See JDK-8010430 and JDK-6430675, both of which have >>

Re: RFR: 8279508: Auto-vectorize Math.round API [v9]

2022-03-02 Thread Sandhya Viswanathan
On Sat, 26 Feb 2022 01:07:47 GMT, Sandhya Viswanathan wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> 8279508: Adding descriptive comments. > > src/hotspot/cpu/x8

Re: RFR: 8279508: Auto-vectorize Math.round API [v9]

2022-03-02 Thread Sandhya Viswanathan
On Sat, 26 Feb 2022 04:55:08 GMT, Jatin Bhateja wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> 8279508: Adding descriptive comments. > > As per SDM, if post conversion a floating point number is

Re: RFR: 8279508: Auto-vectorize Math.round API [v11]

2022-03-02 Thread Sandhya Viswanathan
On Wed, 2 Mar 2022 02:44:41 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar >> IR nodes for above intrinsics. >> - Test creation using new IR testing

Re: RFR: 8279508: Auto-vectorize Math.round API [v9]

2022-03-02 Thread Sandhya Viswanathan
On Sat, 26 Feb 2022 03:38:32 GMT, Quan Anh Mai wrote: >> I believe the indefinite value should be 2^(w - 1) (a.k.a 0x8000) and >> the documentation is typoed. If you look at `cvtss2si`, the indefinite value >> is also written as 2^w - 1 but yet in `MacroAssembler::convert_f2i` we >>

Re: RFR: 8279508: Auto-vectorize Math.round API [v9]

2022-02-25 Thread Sandhya Viswanathan
On Sat, 26 Feb 2022 01:06:21 GMT, Sandhya Viswanathan wrote: >> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> 8279508: Adding descriptive comments. > > src/hotspot/cpu/x8

Re: RFR: 8279508: Auto-vectorize Math.round API [v9]

2022-02-25 Thread Sandhya Viswanathan
On Fri, 25 Feb 2022 06:22:42 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar >> IR nodes for above intrinsics. >> - Test creation using new IR testing

Re: RFR: 8279508: Auto-vectorize Math.round API [v7]

2022-02-23 Thread Sandhya Viswanathan
On Wed, 23 Feb 2022 09:03:37 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar >> IR nodes for above intrinsics. >> - Test creation using new IR testing

Re: RFR: 8279508: Auto-vectorize Math.round API [v7]

2022-02-23 Thread Sandhya Viswanathan
On Wed, 23 Feb 2022 09:03:37 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar >> IR nodes for above intrinsics. >> - Test creation using new IR testing

Re: RFR: 8279508: Auto-vectorize Math.round API [v7]

2022-02-23 Thread Sandhya Viswanathan
On Wed, 23 Feb 2022 09:03:37 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar >> IR nodes for above intrinsics. >> - Test creation using new IR testing

Re: RFR: 8279508: Auto-vectorize Math.round API [v6]

2022-02-22 Thread Sandhya Viswanathan
On Thu, 17 Feb 2022 17:43:43 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar >> IR nodes for above intrinsics. >> - Test creation using new IR testing

Re: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero extended) casts [v3]

2022-02-14 Thread Sandhya Viswanathan
On Sun, 13 Feb 2022 05:18:34 GMT, Quan Anh Mai wrote: >> Hi, >> >> This patch implements the unsigned upcast intrinsics in x86, which are used >> in vector lane-wise reinterpreting operations. >> >> Thank you very much. > > Quan Anh Mai has updated the pull request incrementally with one

Re: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero extended) casts

2022-02-09 Thread Sandhya Viswanathan
On Sat, 5 Feb 2022 15:34:08 GMT, Quan Anh Mai wrote: > Hi, > > This patch implements the unsigned upcast intrinsics in x86, which are used > in vector lane-wise reinterpreting operations. > > Thank you very much. src/hotspot/cpu/x86/assembler_x86.cpp line 4782: > 4780: vector_len ==

Re: RFR: 8279508: Auto-vectorize Math.round API [v2]

2022-01-20 Thread Sandhya Viswanathan
On Wed, 19 Jan 2022 17:38:25 GMT, Jatin Bhateja wrote: >> Summary of changes: >> - Intrinsify Math.round(float) and Math.round(double) APIs. >> - Extend auto-vectorizer to infer vector operations on encountering scalar >> IR nodes for above intrinsics. >> - Test creation using new IR testing

Re: RFR: 8273322: Enhance macro logic optimization for masked logic operations. [v5]

2022-01-06 Thread Sandhya Viswanathan
On Thu, 6 Jan 2022 18:26:32 GMT, Jatin Bhateja wrote: >> Patch extends existing macrologic inferencing algorithm to handle masked >> logic operations. >> >> Existing algorithm: >> >> 1. Identify logic cone roots. >> 2. Packs parent and logic child nodes into a MacroLogic node in bottom up >>

Re: RFR: 8273322: Enhance macro logic optimization for masked logic operations. [v4]

2022-01-06 Thread Sandhya Viswanathan
On Wed, 5 Jan 2022 08:59:00 GMT, Jatin Bhateja wrote: >> Patch extends existing macrologic inferencing algorithm to handle masked >> logic operations. >> >> Existing algorithm: >> >> 1. Identify logic cone roots. >> 2. Packs parent and logic child nodes into a MacroLogic node in bottom up >>

Re: RFR: 8273322: Enhance macro logic optimization for masked logic operations. [v3]

2022-01-04 Thread Sandhya Viswanathan
On Tue, 4 Jan 2022 15:11:47 GMT, Jatin Bhateja wrote: >> Patch extends existing macrologic inferencing algorithm to handle masked >> logic operations. >> >> Existing algorithm: >> >> 1. Identify logic cone roots. >> 2. Packs parent and logic child nodes into a MacroLogic node in bottom up >>

Re: RFR: 8275821: Optimize random number generators developed in JDK-8248862 using Math.unsignedMultiplyHigh() [v4]

2021-12-02 Thread Sandhya Viswanathan
On Thu, 2 Dec 2021 20:43:56 GMT, Vamsi Parasa wrote: >> This change optimizes random number generators using >> Math.unsignedMultiplyHigh() > > Vamsi Parasa has updated the pull request incrementally with one additional > commit since the last revision: > > add seeds for the random

Re: RFR: JDK-8278014: [vectorapi] Remove test run script

2021-11-30 Thread Sandhya Viswanathan
On Tue, 30 Nov 2021 19:22:53 GMT, Paul Sandoz wrote: > Remove Vector API scripts for building and running tests. `jtreg` should be > used instead. > > Also updated the test generation script to remove options that assume > mercurial as the code repository. Looks good to me. -

Re: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh [v2]

2021-10-20 Thread Sandhya Viswanathan
On Tue, 19 Oct 2021 20:34:55 GMT, Vamsi Parasa wrote: >> Optimize the new Math.unsignedMultiplyHigh using the x86 mul instruction. >> This change shows a 1.87X improvement on a micro benchmark. > > Vamsi Parasa has updated the pull request incrementally with one additional > commit since the last
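As background for this thread, here is a minimal sketch of what an unsigned multiply-high computes, written in plain Java so it runs without the newer `Math.unsignedMultiplyHigh` method. It is not the x86 intrinsic under review. The class `UnsignedMultiplyHighSketch` and both helper names are hypothetical; the sketch derives the unsigned high word from the signed `Math.multiplyHigh` with a correction term and checks it against a `BigInteger` reference.

```java
import java.math.BigInteger;

public class UnsignedMultiplyHighSketch {
    // Hypothetical helper: high 64 bits of the unsigned 128-bit product.
    // Reading a negative long as unsigned adds 2^64 to its value, so the
    // signed high word is corrected by adding the other operand.
    static long unsignedHigh(long a, long b) {
        return Math.multiplyHigh(a, b) + ((a >> 63) & b) + ((b >> 63) & a);
    }

    // Slow reference used only to cross-check the sketch.
    static long referenceHigh(long a, long b) {
        BigInteger ua = new BigInteger(Long.toUnsignedString(a));
        BigInteger ub = new BigInteger(Long.toUnsignedString(b));
        return ua.multiply(ub).shiftRight(64).longValue();
    }

    public static void main(String[] args) {
        long a = -1L; // 2^64 - 1 when read as unsigned
        long b = -1L;
        System.out.println(Long.toUnsignedString(unsignedHigh(a, b)));
        System.out.println(unsignedHigh(a, b) == referenceHigh(a, b));
    }
}
```

The `BigInteger` path is far slower than a single hardware multiply; it serves here only as a correctness check.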

Re: RFR: 8275167: x86 intrinsic for unsignedMultiplyHigh [v2]

2021-10-20 Thread Sandhya Viswanathan
On Fri, 15 Oct 2021 20:19:31 GMT, Vladimir Kozlov wrote: >>> How did you verify correctness of results? I suggest extending the >>> `test/jdk/java/lang/Math/MultiplicationTests.java` test to cover the unsigned >>> method. >> >> Tests for unsignedMultiplyHigh were already added in >>

Re: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3]

2021-10-19 Thread Sandhya Viswanathan
On Tue, 19 Oct 2021 22:34:13 GMT, Paul Sandoz wrote: >> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorMask.java >> line 574: >> >>> 572: * @throws ClassCastException if the species is wrong >>> 573: */ >>> 574: abstract VectorMask check(Class> >>> maskClass,

Re: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3]

2021-10-19 Thread Sandhya Viswanathan
On Sat, 16 Oct 2021 00:56:14 GMT, Paul Sandoz wrote: >> This PR improves the performance of vector operations that accept masks on >> architectures that support masking in hardware, specifically Intel AVX512 >> and ARM SVE. >> >> On architectures that do not support masking in hardware the

Re: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v4]

2021-10-19 Thread Sandhya Viswanathan
On Tue, 19 Oct 2021 22:37:10 GMT, Paul Sandoz wrote: >> This PR improves the performance of vector operations that accept masks on >> architectures that support masking in hardware, specifically Intel AVX512 >> and ARM SVE. >> >> On architectures that do not support masking in hardware the

Re: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3]

2021-10-19 Thread Sandhya Viswanathan
On Tue, 19 Oct 2021 19:51:54 GMT, Paul Sandoz wrote: >> src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java >> line 603: >> >>> 601: if (opKind(op, VO_SPECIAL)) { >>> 602: if (op == ZOMO) { >>> 603: return blend(broadcast(-1),

Re: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3]

2021-10-19 Thread Sandhya Viswanathan
On Tue, 19 Oct 2021 18:54:01 GMT, Paul Sandoz wrote: >> src/hotspot/share/utilities/globalDefinitions_vecApi.hpp line 29: >> >>> 27: // the intent of this file to provide a header that can be included in >>> .s files. >>> 28: >>> 29: #ifndef SHARE_VM_UTILITIES_GLOBALDEFINITIONS_VECAPI_HPP >>

Re: RFR: 8271515: Integration of JEP 417: Vector API (Third Incubator) [v3]

2021-10-18 Thread Sandhya Viswanathan
On Sat, 16 Oct 2021 00:56:14 GMT, Paul Sandoz wrote: >> This PR improves the performance of vector operations that accept masks on >> architectures that support masking in hardware, specifically Intel AVX512 >> and ARM SVE. >> >> On architectures that do not support masking in hardware the

Re: RFR: 8274242: Implement fast-path for ASCII-compatible CharsetEncoders on x86

2021-09-24 Thread Sandhya Viswanathan
On Tue, 21 Sep 2021 21:58:48 GMT, Claes Redestad wrote: > This patch extends the `ISO_8859_1.implEncodeISOArray` intrinsic on x86 to > work also for ASCII encoding, which makes for example the `UTF_8$Encoder` > perform on par with (or outperform) similarly getting charset encoded bytes > from

Integrated: 8273450: Fix the copyright header of SVML files

2021-09-08 Thread Sandhya Viswanathan
On Tue, 7 Sep 2021 20:25:25 GMT, Sandhya Viswanathan wrote: > Fix the copyright header of SVML files to match others. > > This was brought up on jdk-dev mailing list: > https://mail.openjdk.java.net/pipermail/jdk-dev/2021-September/005992.html This pull request has now bee

Re: RFR: 8273450: Fix the copyright header of SVML files

2021-09-08 Thread Sandhya Viswanathan
On Wed, 8 Sep 2021 02:03:12 GMT, Paul Sandoz wrote: >> Fix the copyright header of SVML files to match others. >> >> This was brought up on jdk-dev mailing list: >> https://mail.openjdk.java.net/pipermail/jdk-dev/2021-September/005992.html > > Marked as reviewed by psandoz (Reviewer). Thanks a

Re: RFR: 8273450: Fix the copyright header of SVML files

2021-09-07 Thread Sandhya Viswanathan
On Tue, 7 Sep 2021 23:39:54 GMT, David Holmes wrote: >> @dholmes-ora I am from Intel so editing the Intel copyright line should be >> ok? > > @sviswa7 My apologies, I hadn't realized you worked for Intel. But note that > other Intel files i.e. ./hotspot/cpu/x86/macroAssembler_x86_*.cpp also do

Re: RFR: 8273450: Fix the copyright header of SVML files

2021-09-07 Thread Sandhya Viswanathan
On Tue, 7 Sep 2021 23:08:08 GMT, David Holmes wrote: >> Fix the copyright header of SVML files to match others. >> >> This was brought up on jdk-dev mailing list: >> https://mail.openjdk.java.net/pipermail/jdk-dev/2021-September/005992.html > > Hi Sandhya, > > You must not change another

RFR: 8273450: Fix the copyright header of SVML files

2021-09-07 Thread Sandhya Viswanathan
Fix the copyright header of SVML files to match others. This was brought up on jdk-dev mailing list: https://mail.openjdk.java.net/pipermail/jdk-dev/2021-September/005992.html - Commit messages: - 8273450: Fix the copyright header of SVML file Changes:

Integrated: 8272861: Add a micro benchmark for vector api

2021-08-30 Thread Sandhya Viswanathan
On Mon, 23 Aug 2021 23:18:28 GMT, Sandhya Viswanathan wrote: > This pull request adds a micro benchmark for Vector API. > The Black Scholes algorithm is implemented with and without Vector API. > We see about a 6x gain with Vector API for this micro benchmark using 256 bit > vectors

Re: RFR: 8272861: Add a micro benchmark for vector api [v4]

2021-08-26 Thread Sandhya Viswanathan
> This pull request adds a micro benchmark for Vector API. > The Black Scholes algorithm is implemented with and without Vector API. > We see about a 6x gain with Vector API for this micro benchmark using 256 bit > vectors. Sandhya Viswanathan has updated the pull request incrementa

Re: RFR: 8272861: Add a micro benchmark for vector api [v3]

2021-08-25 Thread Sandhya Viswanathan
On Tue, 24 Aug 2021 20:49:52 GMT, Sandhya Viswanathan wrote: >> This pull request adds a micro benchmark for Vector API. >> The Black Scholes algorithm is implemented with and without Vector API. >> We see about a 6x gain with Vector API for this micro benchmark using 2

Re: RFR: 8272861: Add a micro benchmark for vector api [v3]

2021-08-24 Thread Sandhya Viswanathan
On Tue, 24 Aug 2021 10:09:13 GMT, Aleksey Shipilev wrote: >> Sandhya Viswanathan has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Make constants as static final > > Some benchmark comments.

Re: RFR: 8272861: Add a micro benchmark for vector api [v3]

2021-08-24 Thread Sandhya Viswanathan
> This pull request adds a micro benchmark for Vector API. > The Black Scholes algorithm is implemented with and without Vector API. > We see about a 6x gain with Vector API for this micro benchmark using 256 bit > vectors. Sandhya Viswanathan has updated the pull request incrementa

Re: RFR: 8272861: Add a micro benchmark for vector api [v2]

2021-08-24 Thread Sandhya Viswanathan
> This pull request adds a micro benchmark for Vector API. > The Black Scholes algorithm is implemented with and without Vector API. > We see about a 6x gain with Vector API for this micro benchmark using 256 bit > vectors. Sandhya Viswanathan has updated the pull request incrementa

RFR: 8272861: Add a micro benchmark for vector api

2021-08-23 Thread Sandhya Viswanathan
This pull request adds a micro benchmark for Vector API. The Black Scholes algorithm is implemented with and without Vector API. We see about a 6x gain with Vector API for this micro benchmark using 256 bit vectors. - Commit messages: - whitespace - 8272861: Add a micro benchmark

Re: RFR: 8266054: VectorAPI rotate operation optimization [v13]

2021-07-28 Thread Sandhya Viswanathan
On Wed, 28 Jul 2021 04:48:35 GMT, Vladimir Kozlov wrote: >> Looks good to me. > > @sviswa7 and @jatin-bhateja > The push caused https://bugs.openjdk.java.net/browse/JDK-8271366 > I strongly suggest in future asking an Oracle engineer to test Intel's > changes before

Re: RFR: 8266054: VectorAPI rotate operation optimization [v13]

2021-07-27 Thread Sandhya Viswanathan
On Tue, 27 Jul 2021 18:05:49 GMT, Sandhya Viswanathan wrote: >> Correcting this, I2L may be needed in auto-vectorization flow since >> Integer/Long.rotate[Right/Left] APIs accept only integral shift, so for >> Long.rotate* operations integral shift value must be conve

Re: RFR: 8266054: VectorAPI rotate operation optimization [v13]

2021-07-27 Thread Sandhya Viswanathan
On Tue, 20 Jul 2021 09:57:07 GMT, Jatin Bhateja wrote: >> The current VectorAPI Java-side implementation expresses the rotateLeft and >> rotateRight operations using the following operations: >> >> vec1 = lanewise(VectorOperators.LSHL, n) >> vec2 = lanewise(VectorOperators.LSHR, n) >> res =
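The quoted description is cut off at `res =`, so the following is only a scalar sketch of the general shift-and-combine idea, not the Vector API code from the pull request. It assumes the two shifted values are combined with a bitwise OR and that the right shift uses the complementary count; the class `RotateSketch` and the helper `rotateLeftSketch` are hypothetical.

```java
public class RotateSketch {
    // Hypothetical scalar analogue of a rotate built from two shifts.
    // Java masks int shift counts to the low five bits, so -n acts as (32 - n) mod 32.
    static int rotateLeftSketch(int x, int n) {
        int left = x << n;       // scalar counterpart of the LSHL step
        int right = x >>> -n;    // scalar counterpart of the LSHR step (logical shift)
        return left | right;     // assumed combine step
    }

    public static void main(String[] args) {
        int x = 0x80000001;
        System.out.println(Integer.toHexString(rotateLeftSketch(x, 4)));
        System.out.println(rotateLeftSketch(x, 4) == Integer.rotateLeft(x, 4));
    }
}
```

A dedicated rotate operation, where the hardware has one, replaces these three steps with a single instruction, which is the kind of saving the thread title refers to.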

Re: RFR: 8266054: VectorAPI rotate operation optimization [v13]

2021-07-27 Thread Sandhya Viswanathan
On Tue, 27 Jul 2021 08:17:55 GMT, Jatin Bhateja wrote: >> src/hotspot/share/opto/vectorIntrinsics.cpp line 1598: >> >>> 1596: cnt = elem_bt == T_LONG ? gvn().transform(new ConvI2LNode(cnt)) >>> : cnt; >>> 1597: opd2 = gvn().transform(VectorNode::scalar2vector(cnt, num_elem, >>>

Re: RFR: 8266054: VectorAPI rotate operation optimization [v13]

2021-07-26 Thread Sandhya Viswanathan
On Tue, 20 Jul 2021 09:57:07 GMT, Jatin Bhateja wrote: >> The current VectorAPI Java-side implementation expresses the rotateLeft and >> rotateRight operations using the following operations: >> >> vec1 = lanewise(VectorOperators.LSHL, n) >> vec2 = lanewise(VectorOperators.LSHR, n) >> res =

Re: RFR: 8266054: VectorAPI rotate operation optimization [v10]

2021-07-26 Thread Sandhya Viswanathan
On Sun, 18 Jul 2021 20:22:18 GMT, Jatin Bhateja wrote: >> src/hotspot/share/opto/vectornode.cpp line 1180: >> >>> 1178: cnt = cnt->in(1); >>> 1179: } >>> 1180: shiftRCnt = cnt; >> >> Why do we remove the And with mask here? > > And'ing with shift_mask is already done on Java API

Re: RFR: 8266054: VectorAPI rotate operation optimization [v10]

2021-07-15 Thread Sandhya Viswanathan
On Thu, 15 Jul 2021 08:34:42 GMT, Jatin Bhateja wrote: >> The current VectorAPI Java-side implementation expresses the rotateLeft and >> rotateRight operations using the following operations: >> >> vec1 = lanewise(VectorOperators.LSHL, n) >> vec2 = lanewise(VectorOperators.LSHR, n) >> res =

Re: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v7]

2021-06-24 Thread Sandhya Viswanathan
On Thu, 24 Jun 2021 14:50:01 GMT, Vladimir Kozlov wrote: >> Scott Gibbons has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Fixing Windows build warnings > > The rest of testing hs-tier1-4 and xcomp is finished and clean. > So this is the

Re: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v6]

2021-06-22 Thread Sandhya Viswanathan
On Tue, 22 Jun 2021 20:47:55 GMT, Scott Gibbons wrote: >> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. >> Also allows for performance improvement for non-AVX-512 enabled platforms. >> Due to the nature of MIME-encoded inputs, modify the intrinsic signature to

Re: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v6]

2021-06-22 Thread Sandhya Viswanathan
On Tue, 22 Jun 2021 20:47:55 GMT, Scott Gibbons wrote: >> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. >> Also allows for performance improvement for non-AVX-512 enabled platforms. >> Due to the nature of MIME-encoded inputs, modify the intrinsic signature to

Re: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v5]

2021-06-19 Thread Sandhya Viswanathan
On Fri, 18 Jun 2021 22:12:11 GMT, Scott Gibbons wrote: >> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. >> Also allows for performance improvement for non-AVX-512 enabled platforms. >> Due to the nature of MIME-encoded inputs, modify the intrinsic signature to

Re: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v5]

2021-06-18 Thread Sandhya Viswanathan
On Fri, 18 Jun 2021 22:12:11 GMT, Scott Gibbons wrote: >> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. >> Also allows for performance improvement for non-AVX-512 enabled platforms. >> Due to the nature of MIME-encoded inputs, modify the intrinsic signature to

Re: [jdk17] RFR: 8266518: Refactor and expand scatter/gather tests [v2]

2021-06-17 Thread Sandhya Viswanathan
On Thu, 17 Jun 2021 15:09:17 GMT, Paul Sandoz wrote: >> Refactor scatter/gather tests to be included in the load/store test classes >> and expand to support access between `ShortVector` and `char[]`, and >> access between `ByteVector` and `boolean[]`. >> >> Vector tests pass on linux-x64

Re: [jdk17] RFR: 8266518: Refactor and expand scatter/gather tests

2021-06-16 Thread Sandhya Viswanathan
On Mon, 14 Jun 2021 16:26:17 GMT, Paul Sandoz wrote: > Refactor scatter/gather tests to be included in the load/store test classes > and expand to support access between `ShortVector` and `char[]`, and > access between `ByteVector` and `boolean[]`. > > Vector tests pass on linux-x64

Re: [jdk17] RFR: 8268353: Test libsvml.so is and is not present in jdk image

2021-06-15 Thread Sandhya Viswanathan
On Mon, 14 Jun 2021 16:06:04 GMT, Paul Sandoz wrote: > Test that when the jdk.incubator.vector module is present that libsvml.so is > present, and test the opposite case. Looks good to me. - Marked as reviewed by sviswanathan (Reviewer). PR:

Re: RFR: 8268276: Base64 Decoding optimization for x86 using AVX-512 [v3]

2021-06-08 Thread Sandhya Viswanathan
On Tue, 8 Jun 2021 00:30:38 GMT, Scott Gibbons wrote: >> Add the Base64 Decode intrinsic for x86 to utilize AVX-512 for acceleration. >> Also allows for performance improvement for non-AVX-512 enabled platforms. >> Due to the nature of MIME-encoded inputs, modify the intrinsic signature to

Integrated: 8268151: Vector API toShuffle optimization

2021-06-04 Thread Sandhya Viswanathan
On Thu, 3 Jun 2021 00:29:00 GMT, Sandhya Viswanathan wrote: > The Vector API toShuffle method can be optimized using existing vector > conversion intrinsic. > > The following changes are made: > 1) vector.toShuffle java implementation is changed to call > VectorSu

Re: RFR: 8268151: Vector API toShuffle optimization [v2]

2021-06-04 Thread Sandhya Viswanathan
On Fri, 4 Jun 2021 13:03:24 GMT, Vladimir Ivanov wrote: >> Sandhya Viswanathan has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Implement review comments > > Looks good. > > One inefficiency

Re: RFR: 8268151: Vector API toShuffle optimization [v2]

2021-06-03 Thread Sandhya Viswanathan
On Thu, 3 Jun 2021 22:01:12 GMT, Paul Sandoz wrote: >> Sandhya Viswanathan has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Implement review comments > > Java changes look good. @PaulSand

Re: RFR: 8268151: Vector API toShuffle optimization [v2]

2021-06-03 Thread Sandhya Viswanathan
On Thu, 3 Jun 2021 02:31:51 GMT, Xiaohong Gong wrote: >> Sandhya Viswanathan has updated the pull request incrementally with one >> additional commit since the last revision: >> >> Implement review comments > > src/jdk.incubator.vector/share/classes/jdk/incub

Re: RFR: 8268151: Vector API toShuffle optimization [v2]

2021-06-03 Thread Sandhya Viswanathan
unnecessary boxing > during back to back vector.toShuffle and shuffle.toVector calls. > > Best Regards, > Sandhya Sandhya Viswanathan has updated the pull request incrementally with one additional commit since the last revision: Implement review comments - Chang

Integrated: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics

2021-06-03 Thread Sandhya Viswanathan
On Thu, 22 Apr 2021 19:07:28 GMT, Sandhya Viswanathan wrote: > This PR contains Short Vector Math Library support related changes for > [JEP-414 Vector API (Second Incubator)](https://openjdk.java.net/jeps/414), > in preparation for when targeted. > > Intel Short Vector Mat

Re: RFR: 8268151: Vector API toShuffle optimization

2021-06-03 Thread Sandhya Viswanathan
On Thu, 3 Jun 2021 02:27:35 GMT, Xiaohong Gong wrote: >> The Vector API toShuffle method can be optimized using existing vector >> conversion intrinsic. >> >> The following changes are made: >> 1) vector.toShuffle java implementation is changed to call >> VectorSupport.convert. >> 2) The

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v17]

2021-06-03 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

RFR: 8268151: Vector API toShuffle optimization

2021-06-02 Thread Sandhya Viswanathan
The Vector API toShuffle method can be optimized using existing vector conversion intrinsic. The following changes are made: 1) vector.toShuffle java implementation is changed to call VectorSupport.convert. 2) The conversion intrinsic (inline_vector_convert()) in vectorIntrinsics.cpp is

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v16]

2021-06-02 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v15]

2021-05-25 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v14]

2021-05-25 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Integrated: 8267190: Optimize Vector API test operations

2021-05-21 Thread Sandhya Viswanathan
On Fri, 14 May 2021 23:58:38 GMT, Sandhya Viswanathan wrote: > Vector API test operations (IS_DEFAULT, IS_FINITE, IS_INFINITE, IS_NAN and > IS_NEGATIVE) are computed in three steps: > 1) reinterpreting the floating point vectors as integral vectors (int/long) > 2) perform the tes

Re: RFR: 8267190: Optimize Vector API test operations [v3]

2021-05-20 Thread Sandhya Viswanathan
On Thu, 20 May 2021 23:19:01 GMT, Sandhya Viswanathan wrote: >> Vector API test operations (IS_DEFAULT, IS_FINITE, IS_INFINITE, IS_NAN and >> IS_NEGATIVE) are computed in three steps: >> 1) reinterpreting the floating point vectors as integral vectors (int/long) >

Re: RFR: 8267190: Optimize Vector API test operations [v3]

2021-05-20 Thread Sandhya Viswanathan
s > VectorTestPerf.IS_INFINITE 1024 thrpt 5 8932.730 ± 269.988 ops/ms > VectorTestPerf.IS_NAN 1024 thrpt 5 8574.872 ± 498.649 ops/ms > VectorTestPerf.IS_NEGATIVE 1024 thrpt 5 8838.400 ± 11.849 ops/ms > > Best Regards, > Sandhya Sandhya Viswanathan has updated the pull request

Re: RFR: 8267190: Optimize Vector API test operations [v2]

2021-05-19 Thread Sandhya Viswanathan
s > VectorTestPerf.IS_INFINITE 1024 thrpt 5 8932.730 ± 269.988 ops/ms > VectorTestPerf.IS_NAN 1024 thrpt 5 8574.872 ± 498.649 ops/ms > VectorTestPerf.IS_NEGATIVE 1024 thrpt 5 8838.400 ± 11.849 ops/ms > > Best Regards, > Sandhya Sandhya Viswanathan has updated the pull request

Re: RFR: 8267190: Optimize Vector API test operations

2021-05-19 Thread Sandhya Viswanathan
On Wed, 19 May 2021 16:51:33 GMT, Paul Sandoz wrote: >> Vector API test operations (IS_DEFAULT, IS_FINITE, IS_INFINITE, IS_NAN and >> IS_NEGATIVE) are computed in three steps: >> 1) reinterpreting the floating point vectors as integral vectors (int/long) >> 2) perform the test in integer domain
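As a rough illustration of the "test in integer domain" idea quoted above, here is a scalar sketch (not the patch's vectorized code). It reinterprets a `float` as its raw `int` bits and evaluates each predicate with integer operations only. The class name `FpBitTests`, the helper names, and the purely bitwise reading of IS_DEFAULT and IS_NEGATIVE (so `-0.0f` counts as negative and not as default) are assumptions made for this example.

```java
// Scalar sketch only: the class and method names are hypothetical and
// do not come from the pull request.
final class FpBitTests {
    private static final int SIGN_MASK = 0x80000000; // bit 31
    private static final int EXP_MASK  = 0x7F800000; // bits 30..23
    private static final int MANT_MASK = 0x007FFFFF; // bits 22..0

    // Step 1: reinterpret the float as an int without canonicalizing NaNs.
    static int bits(float f) {
        return Float.floatToRawIntBits(f);
    }

    // Step 2: run each test on the integer bits.
    static boolean isDefault(float f) {
        return bits(f) == 0;                      // assumed: all bits zero, so only +0.0f
    }

    static boolean isNegative(float f) {
        return bits(f) < 0;                       // assumed: sign bit set, includes -0.0f
    }

    static boolean isFinite(float f) {
        return (bits(f) & EXP_MASK) != EXP_MASK;  // exponent not all ones
    }

    static boolean isInfinite(float f) {
        return (bits(f) & ~SIGN_MASK) == EXP_MASK; // exponent all ones, mantissa zero
    }

    static boolean isNaN(float f) {
        int b = bits(f);
        return (b & EXP_MASK) == EXP_MASK && (b & MANT_MASK) != 0;
    }
}
```

The same masks would apply lane-wise once a floating-point vector has been reinterpreted as an `int` vector; for `double`, the corresponding 64-bit masks and `Double.doubleToRawLongBits` would be used instead.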

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v13]

2021-05-19 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v12]

2021-05-19 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v2]

2021-05-19 Thread Sandhya Viswanathan
On Wed, 19 May 2021 22:02:14 GMT, Paul Sandoz wrote: >> Tier 1 to 3 tests pass for the default set of build profiles. > >> Thanks a lot for the review @PaulSandoz @iwanowww @erikj79. >> Paul and Vladimir, I have implemented your review comments. Please take a >> look. > > `case VECTOR_OP_OR`

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v11]

2021-05-19 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v2]

2021-05-19 Thread Sandhya Viswanathan
On Mon, 3 May 2021 21:41:26 GMT, Paul Sandoz wrote: >> Sandhya Viswanathan has updated the pull request with a new target base due >> to a merge or a rebase. The pull request now contains six commits: >> >> - Merge master >> - remove whitespace >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v10]

2021-05-19 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v9]

2021-05-18 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v8]

2021-05-18 Thread Sandhya Viswanathan
On Wed, 19 May 2021 00:58:15 GMT, Sandhya Viswanathan wrote: >> This PR contains Short Vector Math Library support related changes for >> [JEP-414 Vector API (Second Incubator)](https://openjdk.java.net/jeps/414), >> in preparation for when targeted. >> >> I

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v7]

2021-05-18 Thread Sandhya Viswanathan
On Wed, 19 May 2021 00:26:48 GMT, Vladimir Kozlov wrote: >> Sandhya Viswanathan has updated the pull request incrementally with one >> additional commit since the last revision: >> >> jcheck fixes > > This is much much better! Thank you for changing it. I am onl

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v8]

2021-05-18 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v7]

2021-05-18 Thread Sandhya Viswanathan
s 1.85 > Float64Vector.ASIN 47.30 95.72 ops/ms 2.02 > Float64Vector.ATAN 20.62 49.45 ops/ms 2.40 > Float64Vector.ATAN2 15.95 112.35 ops/ms 7.04 > Float64Vector.CBRT 24.03 134.57 ops/ms 5.60 > Float64Vector.COS 44.28 394.33 ops/ms 8.91 > Float64Vector.COSH 28.35 95.27 ops/ms 3.36 >

Re: RFR: 8265783: Create a separate library for x86 Intel SVML assembly intrinsics [v6]

2021-05-18 Thread Sandhya Viswanathan
On Tue, 18 May 2021 23:43:13 GMT, Sandhya Viswanathan wrote: >> This PR contains Short Vector Math Library support related changes for >> [JEP-414 Vector API (Second Incubator)](https://openjdk.java.net/jeps/414), >> in preparation for when targeted. >> >> I
