On Fri, 3 Nov 2023 23:20:49 GMT, Sandhya Viswanathan <sviswanat...@openjdk.org> wrote:

>> Jatin Bhateja has updated the pull request incrementally with one additional commit since the last revision:
>>
>>   Restricting masked sub-word gather to AVX512 target to align with integral gather support.
>
> src/hotspot/cpu/x86/c2_MacroAssembler_x86.cpp line 1576:
>
>> 1574:   Label* larr[] = { &case0, &case1, &case2, &case3, &case4, &case5, &case6, &case7 };
>> 1575:   for (int i = 0; i < 8; i++) {
>> 1576:     bt(mask, midx);
>
> Could we not use smaller length bt and inc instructions (e.g. 32 bit one) here as we know that we don't need 64 bits of mask here? That way we will have smaller instruction encoding.

I get your point: it may save a prefix byte for short vectors in one case, but a REX prefix may not be avoidable if the allocator picks a register from the higher register bank (r8-r15), and the mask corresponding to Byte64 does need all 64 bits.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/16354#discussion_r1382573870
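
To make the encoding-size argument above concrete, here is a minimal standalone sketch of how the register-direct form of `bt` (opcode `0F A3 /r`) is encoded. It is not HotSpot code: `encode_bt` and its parameters are hypothetical names used only for illustration, and register numbers follow the usual x86-64 numbering (rax = 0 ... r15 = 15).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of HotSpot): encodes `bt base, bitoff`
// in its register-direct form, opcode 0F A3 /r.
// base   -> ModRM.rm  (the register holding the bits being tested)
// bitoff -> ModRM.reg (the register holding the bit index)
std::vector<uint8_t> encode_bt(int base, int bitoff, bool is64) {
    std::vector<uint8_t> out;
    uint8_t rex = 0x40;                 // REX prefix base pattern 0100WRXB
    if (is64)        rex |= 0x08;       // REX.W selects 64-bit operand size
    if (bitoff >= 8) rex |= 0x04;       // REX.R extends ModRM.reg to r8-r15
    if (base >= 8)   rex |= 0x01;       // REX.B extends ModRM.rm to r8-r15
    if (rex != 0x40) out.push_back(rex); // emit the prefix only if a bit is set
    out.push_back(0x0F);
    out.push_back(0xA3);
    // mod = 11 (register-direct), then reg and rm low three bits
    out.push_back(static_cast<uint8_t>(0xC0 | ((bitoff & 7) << 3) | (base & 7)));
    return out;
}
```

With both operands in the low bank, the 32-bit form is 3 bytes and the 64-bit form is 4. Once either operand lands in r8-r15, the 32-bit form also needs a prefix byte and the two forms have the same length, which is the case the reply refers to.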