On Tue, 7 Jun 2022 17:14:18 GMT, Quan Anh Mai wrote:
> Hi,
>
> This patch implements intrinsics for `Integer/Long::compareUnsigned` using
> the same approach as the JVM does for long and floating-point comparisons.
> This allows efficient and reliable usage of unsigned comparison in Java,
> which is a basic operation and is important for range checks such as
> discussed in #8620.
>
> Thank you very much.
On Tue, 7 Jun 2022 17:41:13 GMT, Vladimir Kozlov wrote:
>> Quan Anh Mai has updated the pull request incrementally with two additional
>> commits since the last revision:
>>
>> - remove comments
>> - review comments
>
> src/hotspot/share/
Quan Anh Mai has updated the pull request incrementally with two additional
commits since the last revision:
- remove comments
- review comments
-
Changes:
- all: https://git.openjdk.j
Hi,
This patch implements intrinsics for `Integer/Long::compareUnsigned` using the
same approach as the JVM does for long and floating-point comparisons. This
allows efficient and reliable usage of unsigned comparison in Java, which is a
basic operation and is important for range checks such as discussed in #8620.
Thank you very much.
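To illustrate the range-check use case (a sketch, not code from the patch; the class and method names are made up):

```java
public class UnsignedCompareDemo {
    // The usual signed two-sided bounds check.
    static boolean inRangeSigned(int i, int bound) {
        return i >= 0 && i < bound;
    }

    // Same predicate as a single unsigned comparison: a negative i wraps to a
    // huge unsigned value, so "i u< bound" covers both sides at once.
    static boolean inRangeUnsigned(int i, int bound) {
        return Integer.compareUnsigned(i, bound) < 0;
    }

    public static void main(String[] args) {
        for (int i : new int[] {-1, 0, 5, 9, 10, Integer.MIN_VALUE}) {
            if (inRangeSigned(i, 10) != inRangeUnsigned(i, 10)) {
                throw new AssertionError("mismatch at " + i);
            }
        }
        System.out.println("ok");
    }
}
```

With the intrinsic, the compiler can reduce `compareUnsigned(x, y) < 0` to a single unsigned branch instead of materialising the -1/0/1 result.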
On Wed, 4 May 2022 01:59:17 GMT, Quan Anh Mai wrote:
> Hi,
>
> This patch optimises the matching rules for floating-point comparison with
> respect to eq/ne on x86-64.
>
> 1. When the inputs of a comparison are the same (i.e. `isNaN` patterns), `ZF`
> is always set, so we don't need `cmpOpUCF2` for the eq/ne cases
On Sat, 21 May 2022 10:31:25 GMT, Quan Anh Mai wrote:
>> Hi,
>>
>> This patch optimises the matching rules for floating-point comparison with
>> respect to eq/ne on x86-64.
>>
>> 1. When the inputs of a comparison are the same (i.e. `isNaN` patterns), `ZF`
On Wed, 4 May 2022 23:27:45 GMT, Vladimir Kozlov wrote:
>> The changes to `Float` and `Double` look good. I don't think we need
>> additional tests, see test/jdk/java/lang/Math/IeeeRecommendedTests.java.
>>
>> At first i thought we no longer need PR #8459 but it seems both PRs are
>>
> FPComparison.isInfiniteDouble  avgt  5  1232.800 ±  31.677  621.185 ± 11.935  ns/op  1.98
> FPComparison.isInfiniteFloat   avgt  5  1234.708 ±  70.239  623.566 ± 15.206  ns/op  1.98
> FPComparison.isNanDouble       avgt  5  2255.847 ±   7.238  400.124 ±  0.762  ns/op
On Wed, 18 May 2022 15:44:10 GMT, Quan Anh Mai wrote:
> This patch backs out the changes made by
> [JDK-8285390](https://bugs.openjdk.java.net/browse/JDK-8285390) and
> [JDK-8284742](https://bugs.openjdk.java.net/browse/JDK-8284742) since there
> are failures due to div nodes floating above their validity checks.
On Thu, 19 May 2022 15:29:29 GMT, Quan Anh Mai wrote:
>> This patch backs out the changes made by
>> [JDK-8285390](https://bugs.openjdk.java.net/browse/JDK-8285390) and
>> [JDK-8284742](https://bugs.openjdk.java.net/browse/JDK-8284742) since there
>> are failures due to div nodes floating above their validity checks.
> This patch backs out the changes made by
> [JDK-8285390](https://bugs.openjdk.java.net/browse/JDK-8285390) and
> [JDK-8284742](https://bugs.openjdk.java.net/browse/JDK-8284742) since there
> are failures due to div nodes floating above their validity checks.
>
> Thanks.
On Wed, 4 May 2022 23:16:41 GMT, Vladimir Kozlov wrote:
>> src/hotspot/cpu/x86/x86_64.ad line 6998:
>>
>>> 6996: ins_encode %{
>>> 6997: __ cmovl(Assembler::parity, $dst$$Register, $src$$Register);
>>> 6998: __ cmovl(Assembler::notEqual, $dst$$Register, $src$$Register);
>>
>> Should
This patch backs out the changes made by
[JDK-8285390](https://bugs.openjdk.java.net/browse/JDK-8285390) and
[JDK-8284742](https://bugs.openjdk.java.net/browse/JDK-8284742) since there are
failures due to div nodes floating above their validity checks.
Thanks.
-
Commit messages:
-
On Wed, 4 May 2022 01:59:17 GMT, Quan Anh Mai wrote:
> Hi,
>
> This patch optimises the matching rules for floating-point comparison with
> respect to eq/ne on x86-64.
>
> 1. When the inputs of a comparison are the same (i.e. `isNaN` patterns), `ZF`
> is always set, so we don't need `cmpOpUCF2` for the eq/ne cases
> FPComparison.isFiniteDouble    avgt  5  518.309 ± 107.352  ns/op
> FPComparison.isFiniteFloat     avgt  5  515.576 ±  14.669  ns/op
> FPComparison.isInfiniteDouble  avgt  5  621.185 ±  11.935  ns/op
> FPComparison.isInfiniteFloat   avgt  5  623.566 ±  15.206  ns/op
On Fri, 13 May 2022 01:35:40 GMT, Xiaohong Gong wrote:
>> Checking whether the indexes of masked lanes are inside of the valid memory
>> boundary is necessary for masked vector memory access. However, this could
>> be saved if the given offset is inside of the vector range that could make
>>
On Fri, 13 May 2022 01:27:18 GMT, Xiaohong Gong wrote:
>> Maybe we could use `a.length - vsp.length() > 0 && offset u< a.length -
>> vsp.length()` which would hoist the first check outside of the loop.
>> Thanks.
>
>> Maybe we could use `a.length - vsp.length() > 0 && offset u< a.length -
>>
On Tue, 10 May 2022 01:23:55 GMT, Xiaohong Gong wrote:
> Checking whether the indexes of masked lanes are inside of the valid memory
> boundary is necessary for masked vector memory access. However, this could be
> saved if the given offset is inside of the vector range that could make sure
>
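The suggested condition can be sketched in scalar Java (names hypothetical; the real code lives in the Vector API implementation):

```java
public class MaskedBoundsDemo {
    // Sketch of the suggested guard: "a.length - vsp.length() > 0 &&
    // offset u< a.length - vsp.length()". The first clause is loop-invariant
    // and can be hoisted; the unsigned compare also rejects negative offsets.
    static boolean inVectorRange(int offset, int aLength, int vlen) {
        int limit = aLength - vlen;
        return limit > 0 && Integer.compareUnsigned(offset, limit) < 0;
    }

    public static void main(String[] args) {
        if (!inVectorRange(0, 16, 4)) throw new AssertionError();
        if (inVectorRange(-1, 16, 4)) throw new AssertionError(); // negative offset rejected
        if (inVectorRange(8, 4, 8)) throw new AssertionError();   // array shorter than vector
        System.out.println("ok");
    }
}
```

When the guard holds, the in-range masked access can skip the per-lane boundary check entirely.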
Hi,
This patch optimises the matching rules for floating-point comparison with
respect to eq/ne on x86-64.
1. When the inputs of a comparison are the same (i.e. `isNaN` patterns), `ZF` is
always set, so we don't need `cmpOpUCF2` for the eq/ne cases, which improves
the sequence of `If (CmpF x x)
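For background (my summary, not text from the patch): `ucomiss`/`ucomisd` with identical operands sets `ZF` both for equality and for the unordered (NaN) result, so only `PF` distinguishes NaN. At the Java level, the pattern being optimised is the self-comparison idiom:

```java
public class NaNCompareDemo {
    // The `isNaN` pattern: only NaN compares unequal to itself, so this
    // compiles down to a single floating-point self-comparison.
    static boolean isNaNViaCompare(float x) {
        return x != x;
    }

    public static void main(String[] args) {
        if (!isNaNViaCompare(Float.NaN)) throw new AssertionError();
        if (isNaNViaCompare(1.0f)) throw new AssertionError();
        if (isNaNViaCompare(Float.POSITIVE_INFINITY)) throw new AssertionError();
        System.out.println("ok");
    }
}
```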
On Sun, 17 Apr 2022 14:35:14 GMT, Jie Fu wrote:
> Hi all,
>
> According to the Vector API doc, the `LSHR` operator computes
> `a>>>(n&(ESIZE*8-1))`.
> However, the current implementation is incorrect for negative bytes/shorts.
>
> The background is that one of our customers tried to vectorize
On Mon, 18 Apr 2022 08:29:52 GMT, Jie Fu wrote:
>>> @DamonFool
>>>
>>> I think the issue is that these two cases of yours are not equal
>>> semantically.
>>
>> Why?
>> According to the vector api doc, they should compute the same value when the
>> shift_cnt is 3, right?
>>
On Mon, 18 Apr 2022 04:14:39 GMT, Jie Fu wrote:
> However, just imagine that someone would like to optimize some code segments of
> bytes/shorts `>>>`
Then that person can just use signed shift (`VectorOperators.ASHR`), right?
Shifting on masked shift counts means that the shift count cannot be
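The semantic difference under discussion can be reproduced in scalar Java (helper names are hypothetical):

```java
public class ByteLshrDemo {
    // Vector API LSHR semantics on a byte lane: shift within the 8-bit lane,
    // count masked by ESIZE*8-1 = 7, with the value treated as unsigned.
    static byte lshrLane(byte a, int n) {
        return (byte) ((a & 0xFF) >>> (n & 7));
    }

    // Naive scalar >>> first promotes the byte to int with sign extension,
    // so negative bytes give a different result.
    static byte naive(byte a, int n) {
        return (byte) (a >>> (n & 7));
    }

    public static void main(String[] args) {
        byte a = -1;                                         // 0xFF
        if (lshrLane(a, 3) != (byte) 0x1F) throw new AssertionError();
        if (naive(a, 3) == lshrLane(a, 3)) throw new AssertionError();
        System.out.println("ok");
    }
}
```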
Hi,
This patch moves the handling of integral division overflow on x86 from code
emission time to parsing time. This allows the compiler to perform more
efficient transformations and also aids in achieving better code layout.
I also removed the handling for division by 10 in the ad file since
On Fri, 8 Apr 2022 16:39:31 GMT, Vladimir Kozlov wrote:
>> Hi Vladimir (@vnkozlov),
>>
>> Incorporated all the suggestions you made in the previous review and pushed
>> a new commit.
>> Please let me know if anything else is needed.
>>
>> Thanks,
>> Vamsi
>
> @vamsi-parasa I got failures in
On Fri, 8 Apr 2022 01:05:33 GMT, Srinivas Vamsi Parasa wrote:
>> Optimizes the divideUnsigned() and remainderUnsigned() methods in
>> java.lang.Integer and java.lang.Long classes using x86 intrinsics. This
>> change shows 3x improvement for Integer methods and up to 25% improvement for
>>
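For reference, the methods being intrinsified interpret their operands as unsigned (a small sanity-check sketch, not code from the patch):

```java
public class UnsignedDivDemo {
    public static void main(String[] args) {
        int x = -2;  // bit pattern 0xFFFFFFFE, i.e. 4294967294 unsigned
        if (Integer.divideUnsigned(x, 3) != 1431655764) throw new AssertionError();
        if (Integer.remainderUnsigned(x, 3) != 2) throw new AssertionError();
        // -1L is 2^64 - 1 unsigned; halving it gives 2^63 - 1 = Long.MAX_VALUE
        if (Long.divideUnsigned(-1L, 2L) != Long.MAX_VALUE) throw new AssertionError();
        System.out.println("ok");
    }
}
```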
On Wed, 30 Mar 2022 10:31:59 GMT, Xiaohong Gong wrote:
> Currently the vector load with mask when the given index happens to be out of
> the array boundary is implemented with pure Java scalar code to avoid the
> IOOBE (IndexOutOfBoundsException). This is necessary for architectures that do
> not
On Tue, 29 Mar 2022 21:56:18 GMT, Vamsi Parasa wrote:
>> This is both complicated and inefficient, I would suggest building the
>> intrinsic in the IR graph so that the compiler can simplify
>> `Integer.compareUnsigned(x, y) < 0` into `x u< y`. Thanks.
>
>> This is both complicated and
On Sun, 27 Mar 2022 06:15:34 GMT, Vamsi Parasa wrote:
> Implements x86 intrinsics for compare() method in java.lang.Integer and
> java.lang.Long.
This is both complicated and inefficient, I would suggest building the
intrinsic in the IR graph so that the compiler can simplify
`Integer.compareUnsigned(x, y) < 0` into `x u< y`. Thanks.
On Tue, 22 Mar 2022 02:52:07 GMT, Jatin Bhateja wrote:
>>> A read from constant table will incur minimum of L1I access penalty to
>>> access code blob or at worst even more if data is not present in first
>>> level cache
>>
>> But your approach comes at a cost of frontend bandwidth and port
On Mon, 21 Mar 2022 18:25:36 GMT, Jatin Bhateja wrote:
> A read from the constant table will incur at minimum an L1I access penalty to
> access the code blob, or worse if the data is not present in the first-level cache.
But your approach comes at a cost of frontend bandwidth and port contention,
On Sun, 13 Mar 2022 04:27:44 GMT, Jatin Bhateja wrote:
>> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4178:
>>
>>> 4176: movl(scratch, 1056964608);
>>> 4177: movq(xtmp1, scratch);
>>> 4178: vbroadcastss(xtmp1, xtmp1, vec_enc);
>>
>> You could put the constant in the constant table
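The magic number in the quoted snippet is the bit pattern of `0.5f`, which the rounding sequence broadcasts to every lane (a quick check of that reading, not code from the patch):

```java
public class HalfConstantDemo {
    public static void main(String[] args) {
        // 1056964608 == 0x3F000000, the IEEE-754 single-precision bit
        // pattern of 0.5f; vbroadcastss replicates it across the vector.
        if (Float.intBitsToFloat(1056964608) != 0.5f) throw new AssertionError();
        if (Float.floatToIntBits(0.5f) != 0x3F000000) throw new AssertionError();
        System.out.println("ok");
    }
}
```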
On Sat, 12 Mar 2022 09:48:14 GMT, xpbob wrote:
> * Constructs an empty list with an initial capacity of ten
>
> =>
>
> * Constructs an empty list with default sized empty instances.
>
>
> private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};
>
>
On Sat, 12 Mar 2022 23:22:16 GMT, Quan Anh Mai wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional
>> commit since the last revision:
>>
>> 8279508: Creating separate test for round double under feature check.
>
> src/hotspot
On Sat, 12 Mar 2022 19:58:37 GMT, Jatin Bhateja wrote:
>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar
>> IR nodes for above intrinsics.
>> - Test creation using new IR testing
On Fri, 4 Mar 2022 17:44:44 GMT, Ludovic Henry wrote:
>> Despite the hash value being cached for Strings, computing the hash still
>> represents a significant CPU usage for applications handling lots of text.
>>
>> Even though it would be generally better to do it through an enhancement to
>>
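The hot loop in question is the classic 31-based polynomial hash; a scalar sketch that mirrors the Latin-1 case (my reconstruction, not code from the PR):

```java
import java.nio.charset.StandardCharsets;

public class StringHashDemo {
    // The 31-based polynomial hash over Latin-1 bytes, as in
    // StringLatin1.hashCode; this is the loop being vectorized.
    static int hash(byte[] latin1) {
        int h = 0;
        for (byte b : latin1) {
            h = 31 * h + (b & 0xFF);
        }
        return h;
    }

    public static void main(String[] args) {
        String s = "hello";
        if (hash(s.getBytes(StandardCharsets.ISO_8859_1)) != s.hashCode())
            throw new AssertionError();
        System.out.println("ok");
    }
}
```

The loop can be reassociated into independent lane accumulators because `h` after n steps is a polynomial in 31 with the byte values as coefficients, which is what makes vectorization possible.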
On Sat, 19 Feb 2022 05:51:52 GMT, Quan Anh Mai wrote:
> Hi,
>
> `Objects.requireNonNull` may fail to be inlined. The call is expensive and
> may lead to objects escaping to the heap while the null check is cheap and is
> often elided. I have observed this when using the vector
On Tue, 1 Mar 2022 02:22:49 GMT, Quan Anh Mai wrote:
>> Hi,
>>
>> `Objects.requireNonNull` may fail to be inlined. The call is expensive and
>> may lead to objects escaping to the heap while the null check is cheap and
>> is often elided. I have observed this when using the vector API when a call
>> to `Objects.requireNonNull` leads to vectors being materialised in a hot loop.
>
> Should the other `requireNonNull` be `ForceInline` as well?
>
> Thank you very much.
Quan Anh Mai has updated the pull request incrementally with one additional
commit since the last revision:
the other
On Sat, 26 Feb 2022 03:02:51 GMT, Sandhya Viswanathan wrote:
>> src/hotspot/cpu/x86/x86.ad line 7263:
>>
>>> 7261: __ vector_round_float_avx($dst$$XMMRegister, $src$$XMMRegister,
>>> $xtmp1$$XMMRegister,
>>> 7262: $xtmp2$$XMMRegister,
>>>
On Sat, 26 Feb 2022 03:37:32 GMT, Quan Anh Mai wrote:
>> Clarification, the number in my comments above is (2^w - 1). This is from
>> Intel SDM
>> (https://www.intel.com/content/www/us/en/developer/articles/technical/intel-sdm.html).
>> Also you will need to
Hi,
`Objects.requireNonNull` may fail to be inlined. The call is expensive and may
lead to objects escaping to the heap while the null check is cheap and is often
elided. I have observed this when using the vector API when a call to
`Objects.requireNonNull` leads to vectors being materialised in a hot loop.
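A minimal sketch of the pattern (a hypothetical example, not from the vector API code):

```java
import java.util.Objects;

public class RequireNonNullDemo {
    static int lengthOf(int[] a) {
        // When inlined, this collapses to a cheap (often implicit) null
        // check; when the call fails to inline, it stays a real call site
        // and its argument may be forced to escape.
        return Objects.requireNonNull(a, "a").length;
    }

    public static void main(String[] args) {
        if (lengthOf(new int[3]) != 3) throw new AssertionError();
        boolean threw = false;
        try {
            lengthOf(null);
        } catch (NullPointerException e) {
            threw = true;
        }
        if (!threw) throw new AssertionError();
        System.out.println("ok");
    }
}
```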
On Sat, 5 Feb 2022 15:34:08 GMT, Quan Anh Mai wrote:
> Hi,
>
> This patch implements the unsigned upcast intrinsics in x86, which are used
> in vector lane-wise reinterpreting operations.
>
> Thank you very much.
This pull request has now been integrated.
Changeset: 0af356
On Sun, 13 Feb 2022 05:18:34 GMT, Quan Anh Mai wrote:
>> Hi,
>>
>> This patch implements the unsigned upcast intrinsics in x86, which are used
>> in vector lane-wise reinterpreting operations.
>>
>> Thank you very much.
>
> Quan Anh Mai has updat
On Sun, 13 Feb 2022 03:09:43 GMT, Jatin Bhateja wrote:
>> Summary of changes:
>> - Intrinsify Math.round(float) and Math.round(double) APIs.
>> - Extend auto-vectorizer to infer vector operations on encountering scalar
>> IR nodes for above intrinsics.
>> - Test creation using new IR testing
On Thu, 10 Feb 2022 18:55:29 GMT, Paul Sandoz wrote:
>> Quan Anh Mai has updated the pull request incrementally with two additional
>> commits since the last revision:
>>
>> - minor rename
>> - address reviews
>
> Obse
> Hi,
>
> This patch implements the unsigned upcast intrinsics in x86, which are used
> in vector lane-wise reinterpreting operations.
>
> Thank you very much.
Quan Anh Mai has updated the pull request incrementally with one additional
commit since the last revision:
m
On Thu, 10 Feb 2022 05:05:05 GMT, Jatin Bhateja wrote:
>> Quan Anh Mai has updated the pull request incrementally with two additional
>> commits since the last revision:
>>
>> - minor rename
>> - address reviews
>
> src/hotspot/cpu/x86/x86.ad line 7288
On Wed, 9 Feb 2022 22:52:47 GMT, Sandhya Viswanathan wrote:
>> Quan Anh Mai has updated the pull request incrementally with two additional
>> commits since the last revision:
>>
>> - minor rename
>> - address reviews
>
> src/hotspot/cpu/x86/ass
> Hi,
>
> This patch implements the unsigned upcast intrinsics in x86, which are used
> in vector lane-wise reinterpreting operations.
>
> Thank you very much.
Quan Anh Mai has updated the pull request incrementally with two additional
commits since the last revision
Hi,
This patch implements the unsigned upcast intrinsics in x86, which are used in
vector lane-wise reinterpreting operations.
Thank you very much.
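A scalar analogue of the unsigned (zero-extending) upcast, for intuition (not code from the patch):

```java
public class UnsignedUpcastDemo {
    public static void main(String[] args) {
        byte b = -1;
        // Unsigned upcast zero-extends: 0xFF -> 255
        if ((b & 0xFF) != 255) throw new AssertionError();
        // The plain (signed) upcast sign-extends instead: 0xFF -> -1
        if ((int) b != -1) throw new AssertionError();
        // Same idea for short -> int
        short s = -2;
        if ((s & 0xFFFF) != 65534) throw new AssertionError();
        System.out.println("ok");
    }
}
```

Lane-wise reinterpreting casts to a wider unsigned element type need exactly this zero extension on every lane.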
-
Commit messages:
- unsigned cast intrinsics
Changes: https://git.openjdk.java.net/jdk/pull/7358/files
Webrev:
On Sat, 15 Jan 2022 02:21:38 GMT, Jatin Bhateja wrote:
> Summary of changes:
> - Intrinsify Math.round(float) and Math.round(double) APIs.
> - Extend auto-vectorizer to infer vector operations on encountering scalar IR
> nodes for above intrinsics.
> - Test creation using new IR testing
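For reference, the scalar semantics the auto-vectorizer must preserve: `Math.round` rounds half up (effectively floor(x + 0.5) with special cases), not half to even (a sanity check, not code from the patch):

```java
public class RoundDemo {
    public static void main(String[] args) {
        if (Math.round(0.5f) != 1) throw new AssertionError();
        if (Math.round(-0.5f) != 0) throw new AssertionError();  // ties round up
        if (Math.round(2.5) != 3L) throw new AssertionError();   // not banker's rounding
        if (Math.round(Float.NaN) != 0) throw new AssertionError();
        System.out.println("ok");
    }
}
```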