On Wed, 18 Sep 2024 17:00:30 GMT, Sandhya Viswanathan
<[email protected]> wrote:
>> Currently the rearrange and selectFrom APIs check shuffle indices and throw
>> IndexOutOfBoundsException if there is any exceptional source index in the
>> shuffle. This causes the generated code to be less optimal. This PR modifies
>> the rearrange/selectFrom Vector API methods to perform wrapIndexes instead
>> of checkIndexes and performs optimizations to generate efficient code.
>>
>> Summary of changes is as follows:
>> 1) The rearrange/selectFrom methods do wrapIndexes instead of checkIndexes.
>> 2) Intrinsic for wrapIndexes and selectFrom to generate efficient code
>>
>> For the following source:
>>
>>
>> public void test() {
>> var index = ByteVector.fromArray(bspecies128, shuffles[1], 0);
>> for (int j = 0; j < bspecies128.loopBound(size); j +=
>> bspecies128.length()) {
>> var inpvect = ByteVector.fromArray(bspecies128, byteinp, j);
>> index.selectFrom(inpvect).intoArray(byteres, j);
>> }
>> }
>>
>>
>> The code generated for inner main now looks as follows:
>> ;; B24: # out( B24 B25 ) <- in( B23 B24 ) Loop( B24-B24 inner main of
>> N173 strip mined) Freq: 4160.96
>> 0x00007f40d02274d0: movslq %ebx,%r13
>> 0x00007f40d02274d3: vmovdqu 0x10(%rsi,%r13,1),%xmm1
>> 0x00007f40d02274da: vpshufb %xmm2,%xmm1,%xmm1
>> 0x00007f40d02274df: vmovdqu %xmm1,0x10(%rax,%r13,1)
>> 0x00007f40d02274e6: vmovdqu 0x20(%rsi,%r13,1),%xmm1
>> 0x00007f40d02274ed: vpshufb %xmm2,%xmm1,%xmm1
>> 0x00007f40d02274f2: vmovdqu %xmm1,0x20(%rax,%r13,1)
>> 0x00007f40d02274f9: vmovdqu 0x30(%rsi,%r13,1),%xmm1
>> 0x00007f40d0227500: vpshufb %xmm2,%xmm1,%xmm1
>> 0x00007f40d0227505: vmovdqu %xmm1,0x30(%rax,%r13,1)
>> 0x00007f40d022750c: vmovdqu 0x40(%rsi,%r13,1),%xmm1
>> 0x00007f40d0227513: vpshufb %xmm2,%xmm1,%xmm1
>> 0x00007f40d0227518: vmovdqu %xmm1,0x40(%rax,%r13,1)
>> 0x00007f40d022751f: add $0x40,%ebx
>> 0x00007f40d0227522: cmp %r8d,%ebx
>> 0x00007f40d0227525: jl 0x00007f40d02274d0
>>
>> Best Regards,
>> Sandhya
>
> Sandhya Viswanathan has updated the pull request incrementally with one
> additional commit since the last revision:
>
> Change method name
Hi @sviswa7 , some comments, overall patch looks good to me.
Best Regards,
Jatin
src/hotspot/share/opto/vectorIntrinsics.cpp line 2120:
> 2118:
> 2119: if (vector_klass == nullptr || elem_klass == nullptr || vlen ==
> nullptr) {
> 2120: return false; // dead code
Why dead code in comments ?
src/hotspot/share/opto/vectorIntrinsics.cpp line 2129:
> 2127: NodeClassNames[argument(2)->Opcode()],
> 2128: NodeClassNames[argument(3)->Opcode()]);
> 2129: return false; // not enough info for intrinsification
Please club this with above condition to be consistent with other inline
expanders.
src/hotspot/share/opto/vectorIntrinsics.cpp line 2141:
> 2139: }
> 2140: BasicType elem_bt = elem_type->basic_type();
> 2141:
Remove new line.
src/hotspot/share/opto/vectorIntrinsics.cpp line 2144:
> 2142: int num_elem = vlen->get_con();
> 2143: if ((num_elem < 4) || !is_power_of_2(num_elem)) {
> 2144: log_if_needed(" ** vlen < 4 or not power of two=%d", num_elem);
Will num_elem < 4 not be handled by L2149 since we have an implementation
limitation to support less than 32-bit shuffle / masks.
src/hotspot/share/opto/vectorIntrinsics.cpp line 2171:
> 2169: use_predicate = false;
> 2170: if(!is_masked_op ||
> 2171: (!arch_supports_vector(Op_VectorRearrange, num_elem, elem_bt,
> VecMaskNotUsed) ||
Suggestion:
(!arch_supports_vector(Op_VectorRearrange, num_elem, elem_bt,
VecMaskUseLoad) ||
src/hotspot/share/opto/vectorIntrinsics.cpp line 2188:
> 2186:
> 2187: if (v1 == nullptr || v2 == nullptr) {
> 2188: return false; // operand unboxing failed
To be consistent with other expanders please emit proper error for unboxing
failure like on following line.
https://github.com/openjdk/jdk/blob/master/src/hotspot/share/opto/vectorIntrinsics.cpp#L426
src/hotspot/share/opto/vectorIntrinsics.cpp line 2197:
> 2195: mask = unbox_vector(argument(6), mbox_type, elem_bt, num_elem);
> 2196: if (mask == nullptr) {
> 2197: log_if_needed(" ** not supported: op=selectFrom vlen=%d etype=%s
> is_masked_op=1",
Error should an unboxing failure here.
-------------
PR Review: https://git.openjdk.org/jdk/pull/20634#pullrequestreview-2314643808
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766277056
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766277739
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766278169
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766297640
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766292679
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766303620
PR Review Comment: https://git.openjdk.org/jdk/pull/20634#discussion_r1766304688