On Fri, 20 May 2022 09:51:24 GMT, Jatin Bhateja <jbhat...@openjdk.org> wrote:

>> Hi All,
>> 
>> Patch adds the planned support for new vector operations and APIs targeted 
>> for [JEP 426: Vector API (Fourth 
>> Incubator).](https://bugs.openjdk.java.net/browse/JDK-8280173)
>> 
>> Following is the brief summary of changes:-
>> 
>> 1)  Extends the scope of existing lanewise API for following new vector 
>> operations.
>>    -  VectorOperations.BIT_COUNT: counts the number of one-bits
>>    - VectorOperations.LEADING_ZEROS_COUNT: counts the number of leading zero 
>> bits
>>    - VectorOperations.TRAILING_ZEROS_COUNT: counts the number of trailing 
>> zero bits
>>    - VectorOperations.REVERSE: reversing the order of bits
>>    - VectorOperations.REVERSE_BYTES: reversing the order of bytes
>>    - compress and expand bits: Semantics are based on Hacker's Delight 
>> section 7-4 Compress, or Generalized Extract.
>> 
>> 2)  Adds following new APIs to perform cross lane vector compress and 
>> expansion operations under the influence of a mask.
>>    - Vector.compress
>>    - Vector.expand 
>>    - VectorMask.compress
>> 
>> 3) Adds predicated and non-predicated versions of following new APIs to load 
>> and store the contents of vector from foreign MemorySegments. 
>>   - Vector.fromMemorySegment
>>   - Vector.intoMemorySegment
>> 
>> 4) C2 Compiler IR enhancements and optimized X86 and AARCH64 backend support 
>> for each newly added operation.
>> 
>> 
>>  Patch has been regressed over AARCH64 and X86 targets different AVX levels. 
>> 
>>  Kindly review and share your feedback.
>> 
>>  Best Regards,
>>  Jatin
>
> Jatin Bhateja has updated the pull request incrementally with one additional 
> commit since the last revision:
> 
>   8284960: Integrating incremental patches.

src/hotspot/cpu/x86/assembler_x86.cpp line 7934:

> 7932: 
> 7933: void Assembler::evplzcntd(XMMRegister dst, KRegister mask, XMMRegister 
> src, bool merge, int vector_len) {
> 7934:   assert(VM_Version::supports_avx512cd() && (vector_len == AVX_512bit 
> || VM_Version::supports_avx512vl()), "");

Please, split assert as in other instructions - it will help to understand 
failure better.

src/hotspot/cpu/x86/assembler_x86.cpp line 7946:

> 7944: 
> 7945: void Assembler::evplzcntq(XMMRegister dst, KRegister mask, XMMRegister 
> src, bool merge, int vector_len) {
> 7946:   assert(VM_Version::supports_avx512cd() && (vector_len == AVX_512bit 
> || VM_Version::supports_avx512vl()), "");

Split assert.

src/hotspot/cpu/x86/assembler_x86.cpp line 8173:

> 8171: 
> 8172: void Assembler::vinsertf32x4(XMMRegister dst, XMMRegister nds, 
> XMMRegister src, uint8_t imm8) {
> 8173:   assert(VM_Version::supports_evex(), "");

Hmm, did we never trigger this wrong assert because the use was guarded by 
correct check?

src/hotspot/cpu/x86/assembler_x86.cpp line 11720:

> 11718: 
> 11719: void Assembler::evpcompressb(XMMRegister dst, KRegister mask, 
> XMMRegister src, bool merge, int vector_len) {
> 11720:   assert(VM_Version::supports_avx512_vbmi2() && (vector_len == 
> AVX_512bit || VM_Version::supports_avx512vl()), "");

Split assert in this and following new instructions.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4455:

> 4453:       break;
> 4454:     default:
> 4455:       fatal("Unsupported type");

Print wrong type: `fatal("Unsupported type : %s", type2name(type));`
Below too.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4561:

> 4559:       case 4 : evpbroadcastd(dst, rtmp, vec_enc); break;
> 4560:       case 8 : evpbroadcastq(dst, rtmp, vec_enc); break;
> 4561:       default : ShouldNotReachHere(); break;

`ShouldNotReachHere` does not give any information in case of failure. Use 
`fatal()` which prints wrong `lane_size`.
Same below.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4666:

> 4664:       break;
> 4665:     default:
> 4666:       ShouldNotReachHere();

Use `fatal()`.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4693:

> 4691:       break;
> 4692:     default:
> 4693:       ShouldNotReachHere();

Use `fatal()`.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4732:

> 4730:     vector_reverse_byte(bt, dst, xtmp2, rtmp, vec_enc);
> 4731: 
> 4732:   } else if(!VM_Version::supports_avx512vlbw() && vec_enc == 
> Assembler::AVX_512bit) {

No need to check `!VM_Version::supports_avx512vlbw()`.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4759:

> 4757:     vpandn(xtmp2, xtmp2, xtmp1, vec_enc);
> 4758:     vpsrlq(xtmp2, xtmp2, 1, vec_enc);
> 4759:     vporq(xtmp1, dst, xtmp2, vec_enc);

All 3 code snippets are the same except constants. Also similar code in 
`vector_reverse_byte64` for `short` type.
Consider factoring out it into separate method.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4819:

> 4817:       break;
> 4818:     default:
> 4819:       fatal("Unsupported type");

Print wring type.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4846:

> 4844:       break;
> 4845:     default:
> 4846:       fatal("Unsupported type");

Print wring type.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 4892:

> 4890:       break;
> 4891:     default:
> 4892:       ShouldNotReachHere();

Use `fatal` and print type.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 5018:

> 5016:       break;
> 5017:     default:
> 5018:       ShouldNotReachHere();

Use fatal and print type.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 5037:

> 5035:       break;
> 5036:     default:
> 5037:       ShouldNotReachHere();

Use fatal and print type.

src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 5056:

> 5054:       break;
> 5055:     default:
> 5056:       ShouldNotReachHere();

Use fatal and print type.

src/hotspot/cpu/x86/x86.ad line 8879:

> 8877:     // special handling should be removed.
> 8878:     if (bt == T_LONG && rbt == T_INT) {
> 8879:       if (VM_Version::supports_avx512vl()) {

Predicate say `!VM_Version::supports_avx512vl()`

src/hotspot/share/opto/node.hpp line 1006:

> 1004: 
> 1005:   // The node is a CountedLoopEnd with a mask annotation so as to emit 
> a restore context
> 1006:   bool has_vector_mask_set() const { return (_flags & 
> Flag_has_vector_mask_set) != 0; }

I don't see use of this flag.

src/hotspot/share/opto/vectorIntrinsics.cpp line 86:

> 84:     if ((mask_use_type & VecMaskUseLoad) != 0) {
> 85:       if (!Matcher::match_rule_supported_vector(Op_VectorLoadMask, 
> num_elem, elem_bt) ||
> 86:           !Matcher::match_rule_supported_vector(Op_LoadVector, num_elem, 
> T_BOOLEAN)) {

Add comment explaining new check. In follow ing places too.

src/hotspot/share/runtime/vmStructs.cpp line 1779:

> 1777:   declare_c2_type(CMoveVDNode, VectorNode)                              
>   \
> 1778:   declare_c2_type(CompressVNode, VectorNode)                            
>   \
> 1779:   declare_c2_type(ExpandVNode, VectorNode)                              
>   \

Not all new nodes listed.

-------------

PR: https://git.openjdk.java.net/jdk/pull/8425

Reply via email to