On Fri, 3 Nov 2023 23:07:44 GMT, Sandhya Viswanathan <sviswanat...@openjdk.org> wrote:
>> Jatin Bhateja has updated the pull request incrementally with one additional >> commit since the last revision: >> >> Restricting masked sub-word gather to AVX512 target to align with integral >> gather support. > > src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1606: > >> 1604: void C2_MacroAssembler::vpgather8b_offset(BasicType elem_bt, >> XMMRegister dst, Register base, Register idx_base, >> 1605: Register offset, Register >> rtmp, int vlen_enc) { >> 1606: vpxor(dst, dst, dst, vlen_enc); > > Why the vpxor here and in vpgather8b. These are not masked gathers. This is to clear the vector gathering 8 subwords with each iteration. > src/hotspot/cpu/x86/x86.ad line 1921: > >> 1919: case Op_LoadVectorGatherMasked: >> 1920: if (!is_subword_type(bt) && size_in_bits < 512 && >> !VM_Version::supports_avx512vl()) { >> 1921: return false; > > Also need to return false when size_in_bits == 64 for non subword types. match_rule_supported_vector called in the beginning will enforce these checks. > src/hotspot/cpu/x86/x86.ad line 1939: > >> 1937: // fallthrough >> 1938: case Op_LoadVectorGather: >> 1939: if (!is_subword_type(bt) && size_in_bits == 64 ) { > > Since this is a fallthrough case, the change also applies to > StoreVectorScatter*. The !is_subword_type() is needed only for > LoadVectorGather. We do not intrinsify sub-word scatter operations currently. > src/hotspot/cpu/x86/x86.ad line 4074: > >> 4072: BasicType elem_bt = Matcher::vector_element_basic_type(this); >> 4073: assert(!is_subword_type(elem_bt), "sanity"); // T_INT, T_LONG, >> T_FLOAT, T_DOUBLE >> 4074: __ vpcmpeqd($mask$$XMMRegister, $mask$$XMMRegister, >> $mask$$XMMRegister, vlen_enc); > > vpcmpeqd is expensive instruction as compared to movdqu and also unrelated to > subword type support. compare instruction here does not access a memory operand, hence its cheaper compared to memory loads. > src/hotspot/share/opto/vectorIntrinsics.cpp line 1579: > >> 1577: Node* index = argument(11); >> 1578: Node* indexMap = argument(12); >> 1579: Node* indexM = argument(13); > > Could be renamed as follows for better understanding: index -> offset, indexM > -> indexOffset. Also this should be moved under else part of if (is_scatter). Naming scheme is based on the actual names used in intrinsic entry point for these arguments. > src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java > line 3830: > >> 3828: // Check indices are within array bounds. >> 3829: // FIXME: Check index under mask controlling. >> 3830: for (int i = 0; i < vsp.length(); i += lsp.length()) { > > The check (i < vsp.length() ) could instead be (i + lsp.length() <= > vsp.length()). Bit sizes of vector shape is multiple of 128 and we do not support NPOT sizes. ------------- PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1382571225 PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1382571157 PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1382571152 PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1382571178 PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1382571683 PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1382571546