Re: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero extended) casts [v2]

2022-02-14 Thread Paul Sandoz
On Sun, 13 Feb 2022 05:14:47 GMT, Quan Anh Mai  wrote:

>> Observing the following failures on CPUs with 
>> "Intel_R__Xeon_R__Gold_6354_CPU___3.00GHz" with HotSpot flags:
>> 
>> -XX:+CreateCoredumpOnCrash -ea -esa -XX:CompileThreshold=100 
>> -XX:+UnlockExperimentalVMOptions -server -XX:-TieredCompilation
>> 
>> 
>> TestVectorCastAVX512.java:
>> 
>> Failed IR Rules (1)
>> --
>> - Method "public static void 
>> compiler.vectorapi.reshape.tests.TestVectorCast.testUI256toL512(int[],long[])":
>>   * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, 
>> applyIfAnd={}, applyIfOr={}, 
>> counts={"(d+(s){2}(VectorUCastI2X.*)+(s){2}===.*)", "1"}, 
>> applyIfNot={})"
>> - counts: Graph contains wrong number of nodes:
>> Regex 1: (\\d+(\\s){2}(VectorUCastI2X.*)+(\\s){2}===.*)
>> Expected 1 but found 0 nodes.
>> 
>> 
>> TestVectorCastAVX1.java:
>> 
>> - Method "public static void 
>> compiler.vectorapi.reshape.tests.TestVectorCast.testUB64toS64(byte[],short[])":
>>   * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, 
>> applyIfAnd={}, applyIfOr={}, 
>> counts={"(d+(s){2}(VectorUCastB2X.*)+(s){2}===.*)", "1"}, 
>> applyIfNot={})"
>> - counts: Graph contains wrong number of nodes:
>> Regex 1: (\\d+(\\s){2}(VectorUCastB2X.*)+(\\s){2}===.*)
>> Expected 1 but found 0 nodes.
>> 
>> - Method "public static void 
>> compiler.vectorapi.reshape.tests.TestVectorCast.testUB64toI128(byte[],int[])":
>>   * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, 
>> applyIfAnd={}, applyIfOr={}, 
>> counts={"(d+(s){2}(VectorUCastB2X.*)+(s){2}===.*)", "1"}, 
>> applyIfNot={})"
>> - counts: Graph contains wrong number of nodes:
>> Regex 1: (\\d+(\\s){2}(VectorUCastB2X.*)+(\\s){2}===.*)
>> Expected 1 but found 0 nodes.
>
> @PaulSandoz Thanks a lot for testing. The cause seems to be that 
> `LaneType::asIntegral` was missing the `ForceInline` annotation. I have run 
> the reshape tests 10 times without any failure, whereas with the previous 
> patch there were often 1 or 2 failures.
> Thanks.

@merykitty testing now passes. Java bits look good. Needs HotSpot reviewer.

-

PR: https://git.openjdk.java.net/jdk/pull/7358


Re: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero extended) casts [v2]

2022-02-12 Thread Quan Anh Mai
On Thu, 10 Feb 2022 18:55:29 GMT, Paul Sandoz  wrote:

>> Quan Anh Mai has updated the pull request incrementally with two additional 
>> commits since the last revision:
>> 
>>  - minor rename
>>  - address reviews
>
> Observing the following failures on CPUs with 
> "Intel_R__Xeon_R__Gold_6354_CPU___3.00GHz" with HotSpot flags:
> 
> -XX:+CreateCoredumpOnCrash -ea -esa -XX:CompileThreshold=100 
> -XX:+UnlockExperimentalVMOptions -server -XX:-TieredCompilation
> 
> 
> TestVectorCastAVX512.java:
> 
> Failed IR Rules (1)
> --
> - Method "public static void 
> compiler.vectorapi.reshape.tests.TestVectorCast.testUI256toL512(int[],long[])":
>   * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, 
> applyIfAnd={}, applyIfOr={}, 
> counts={"(d+(s){2}(VectorUCastI2X.*)+(s){2}===.*)", "1"}, 
> applyIfNot={})"
> - counts: Graph contains wrong number of nodes:
> Regex 1: (\\d+(\\s){2}(VectorUCastI2X.*)+(\\s){2}===.*)
> Expected 1 but found 0 nodes.
> 
> 
> TestVectorCastAVX1.java:
> 
> - Method "public static void 
> compiler.vectorapi.reshape.tests.TestVectorCast.testUB64toS64(byte[],short[])":
>   * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, 
> applyIfAnd={}, applyIfOr={}, 
> counts={"(d+(s){2}(VectorUCastB2X.*)+(s){2}===.*)", "1"}, 
> applyIfNot={})"
> - counts: Graph contains wrong number of nodes:
> Regex 1: (\\d+(\\s){2}(VectorUCastB2X.*)+(\\s){2}===.*)
> Expected 1 but found 0 nodes.
> 
> - Method "public static void 
> compiler.vectorapi.reshape.tests.TestVectorCast.testUB64toI128(byte[],int[])":
>   * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, 
> applyIfAnd={}, applyIfOr={}, 
> counts={"(d+(s){2}(VectorUCastB2X.*)+(s){2}===.*)", "1"}, 
> applyIfNot={})"
> - counts: Graph contains wrong number of nodes:
> Regex 1: (\\d+(\\s){2}(VectorUCastB2X.*)+(\\s){2}===.*)
> Expected 1 but found 0 nodes.

@PaulSandoz Thanks a lot for testing. The cause seems to be that 
`LaneType::asIntegral` was missing the `ForceInline` annotation. I have run the 
reshape tests 10 times without any failure, whereas with the previous patch 
there were often 1 or 2 failures.
Thanks.

-

PR: https://git.openjdk.java.net/jdk/pull/7358


Re: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero extended) casts [v2]

2022-02-10 Thread Paul Sandoz
On Thu, 10 Feb 2022 15:14:44 GMT, Quan Anh Mai  wrote:

>> Hi,
>> 
>> This patch implements the unsigned upcast intrinsics in x86, which are used 
>> in vector lane-wise reinterpreting operations.
>> 
>> Thank you very much.
>
> Quan Anh Mai has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - minor rename
>  - address reviews

Observing the following failures on CPUs with 
"Intel_R__Xeon_R__Gold_6354_CPU___3.00GHz" with HotSpot flags:

-XX:+CreateCoredumpOnCrash -ea -esa -XX:CompileThreshold=100 
-XX:+UnlockExperimentalVMOptions -server -XX:-TieredCompilation


TestVectorCastAVX512.java:

Failed IR Rules (1)
--
- Method "public static void 
compiler.vectorapi.reshape.tests.TestVectorCast.testUI256toL512(int[],long[])":
  * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, 
applyIfAnd={}, applyIfOr={}, 
counts={"(d+(s){2}(VectorUCastI2X.*)+(s){2}===.*)", "1"}, 
applyIfNot={})"
- counts: Graph contains wrong number of nodes:
Regex 1: (\\d+(\\s){2}(VectorUCastI2X.*)+(\\s){2}===.*)
Expected 1 but found 0 nodes.


TestVectorCastAVX1.java:

- Method "public static void 
compiler.vectorapi.reshape.tests.TestVectorCast.testUB64toS64(byte[],short[])":
  * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, 
applyIfAnd={}, applyIfOr={}, 
counts={"(d+(s){2}(VectorUCastB2X.*)+(s){2}===.*)", "1"}, 
applyIfNot={})"
- counts: Graph contains wrong number of nodes:
Regex 1: (\\d+(\\s){2}(VectorUCastB2X.*)+(\\s){2}===.*)
Expected 1 but found 0 nodes.

- Method "public static void 
compiler.vectorapi.reshape.tests.TestVectorCast.testUB64toI128(byte[],int[])":
  * @IR rule 1: "@compiler.lib.ir_framework.IR(failOn={}, applyIf={}, 
applyIfAnd={}, applyIfOr={}, 
counts={"(d+(s){2}(VectorUCastB2X.*)+(s){2}===.*)", "1"}, 
applyIfNot={})"
- counts: Graph contains wrong number of nodes:
Regex 1: (\\d+(\\s){2}(VectorUCastB2X.*)+(\\s){2}===.*)
Expected 1 but found 0 nodes.
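
For readers unfamiliar with how a `counts` rule like the ones above is evaluated, here is a minimal sketch in plain Java. It only illustrates the idea of counting regex matches in a textual node dump; the class name `IrCountSketch`, the helper `countNodes`, and the two sample dump strings are made up for illustration and are not taken from the IR framework or from real C2 output. The pattern is the one shown in the "Regex 1" lines (with `\\d` and `\\s` being the Java source spelling of `\d` and `\s`).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class IrCountSketch {
    // Same shape as the pattern in the log: a node index, two whitespace
    // characters, the node name, two whitespace characters, then "===".
    static final Pattern UCAST_B2X =
        Pattern.compile("(\\d+(\\s){2}(VectorUCastB2X.*)+(\\s){2}===.*)");

    // Counts the non-overlapping matches of the pattern in a dump string.
    static int countNodes(Pattern pattern, String dump) {
        Matcher m = pattern.matcher(dump);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical dump lines, invented for this sketch.
        String intrinsified =
            " 101  LoadVector  === _ 7 90\n" +
            " 120  VectorUCastB2X  === _ 101\n";
        String notIntrinsified =
            " 101  LoadVector  === _ 7 90\n" +
            " 130  CallStaticJava  === 5 6 7\n";

        // A rule of the form counts = {pattern, "1"} passes only if the count is 1.
        System.out.println(countNodes(UCAST_B2X, intrinsified));    // 1
        System.out.println(countNodes(UCAST_B2X, notIntrinsified)); // 0
    }
}
```

Under this reading, "Expected 1 but found 0 nodes" means the compiled graph for the test method contained no `VectorUCastB2X` node at all, which is consistent with the cast not having been intrinsified in that compilation.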

-

PR: https://git.openjdk.java.net/jdk/pull/7358


Re: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero extended) casts [v2]

2022-02-10 Thread Paul Sandoz
On Thu, 10 Feb 2022 15:14:44 GMT, Quan Anh Mai  wrote:

>> Hi,
>> 
>> This patch implements the unsigned upcast intrinsics in x86, which are used 
>> in vector lane-wise reinterpreting operations.
>> 
>> Thank you very much.
>
> Quan Anh Mai has updated the pull request incrementally with two additional 
> commits since the last revision:
> 
>  - minor rename
>  - address reviews

Running some tests.

-

PR: https://git.openjdk.java.net/jdk/pull/7358


Re: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero extended) casts [v2]

2022-02-10 Thread Quan Anh Mai
On Thu, 10 Feb 2022 05:05:05 GMT, Jatin Bhateja  wrote:

>> Quan Anh Mai has updated the pull request incrementally with two additional 
>> commits since the last revision:
>> 
>>  - minor rename
>>  - address reviews
>
> src/hotspot/cpu/x86/x86.ad line 7288:
> 
>> 7286: break;
>> 7287:   default: assert(false, "%s", type2name(to_elem_bt));
>> 7288: }
> 
> Please move this into a macro assembly routine.

Fixed, thanks.

-

PR: https://git.openjdk.java.net/jdk/pull/7358


Re: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero extended) casts [v2]

2022-02-10 Thread Quan Anh Mai
On Wed, 9 Feb 2022 22:52:47 GMT, Sandhya Viswanathan wrote:

>> Quan Anh Mai has updated the pull request incrementally with two additional 
>> commits since the last revision:
>> 
>>  - minor rename
>>  - address reviews
>
> src/hotspot/cpu/x86/assembler_x86.cpp line 4782:
> 
>> 4780:   vector_len == AVX_256bit? VM_Version::supports_avx2() :
>> 4781:   vector_len == AVX_512bit? VM_Version::supports_evex() : 0, " ");
>> 4782:   InstructionAttr attributes(vector_len, /* rex_w */ false, /* legacy_mode */ _legacy_mode_bw, /* no_mask_reg */ true, /* uses_vl */ true);
> 
> legacy_mode should be false here instead of _legacy_mode_bw.

Fixed, thanks.

-

PR: https://git.openjdk.java.net/jdk/pull/7358


Re: RFR: 8278173: [vectorapi] Add x64 intrinsics for unsigned (zero extended) casts [v2]

2022-02-10 Thread Quan Anh Mai
> Hi,
> 
> This patch implements the unsigned upcast intrinsics in x86, which are used 
> in vector lane-wise reinterpreting operations.
> 
> Thank you very much.

Quan Anh Mai has updated the pull request incrementally with two additional 
commits since the last revision:

 - minor rename
 - address reviews
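
As background on what "unsigned (zero extended) cast" means in the subject line, the following scalar sketch contrasts zero extension with the sign extension that an ordinary Java widening conversion performs. It is only a reference for the lane-wise semantics, written with standard library calls; it is not the Vector API code or the x86 intrinsic from this patch, and the class and method names are invented for illustration. The byte-to-int and int-to-long shapes mirror the `testUB64toI128(byte[],int[])` and `testUI256toL512(int[],long[])` tests mentioned in this thread.

```java
import java.util.Arrays;

public class ZeroExtendSketch {
    // Zero-extending widening, byte -> int: the upper 24 bits become 0.
    static int[] zeroExtendBytes(byte[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = Byte.toUnsignedInt(src[i]); // same as src[i] & 0xFF
        }
        return dst;
    }

    // Ordinary (signed) widening, byte -> int: the sign bit is replicated.
    static int[] signExtendBytes(byte[] src) {
        int[] dst = new int[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        return dst;
    }

    // Zero-extending widening, int -> long: the upper 32 bits become 0.
    static long[] zeroExtendInts(int[] src) {
        long[] dst = new long[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = Integer.toUnsignedLong(src[i]); // same as src[i] & 0xFFFFFFFFL
        }
        return dst;
    }

    public static void main(String[] args) {
        byte[] bytes = {1, -1, -128};
        System.out.println(Arrays.toString(zeroExtendBytes(bytes))); // [1, 255, 128]
        System.out.println(Arrays.toString(signExtendBytes(bytes))); // [1, -1, -128]
        System.out.println(Arrays.toString(zeroExtendInts(new int[] {-1}))); // [4294967295]
    }
}
```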

-

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/7358/files
  - new: https://git.openjdk.java.net/jdk/pull/7358/files/22a70fe1..8028be52

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=7358&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=7358&range=00-01

  Stats: 81 lines in 4 files changed: 32 ins; 44 del; 5 mod
  Patch: https://git.openjdk.java.net/jdk/pull/7358.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/7358/head:pull/7358

PR: https://git.openjdk.java.net/jdk/pull/7358