On Mon, 15 Dec 2025 07:30:10 GMT, Jatin Bhateja <[email protected]> wrote:
>> src/hotspot/share/opto/vectornode.cpp line 1062:
>>
>>> 1060: if (!in1->isa_Vector()) {
>>> 1061: break;
>>> 1062: }
>>
>> Hi, @jatin-bhateja, I didn't quite understand what you meant. I'm not sure
>> if you mistook `isa_Vector` for `isa_vectormask`. Checking `isa_Vector` here
>> is to ensure that `in1` is a `VectorNode`, so that it calls the `as_Vector`
>> function.
>
> Correct, I am seeing a different behaviour between UseAVX=2 and UseAVX=3 for
> the following kernel. It is not related to your new code but is due to another
> side effect. Kindly have a look.
>
>
> public static final VectorSpecies<Float> FSP =
> FloatVector.SPECIES_PREFERRED;
>
> public static long micro(long ctr) {
> VectorMask<Float> mask = VectorMask.fromLong(FSP, 15);
> return mask.toLong();
> }
>
>
> TURIN>java --add-modules=jdk.incubator.vector -XX:UseAVX=3 -Xbatch
> -XX:-TieredCompilation
> -XX:CompileCommand=PrintIdealPhase,testmcast::micro,BEFORE_MATCHING -cp . testmcast
> CompileCommand: PrintIdealPhase testmcast.micro const char* PrintIdealPhase =
> 'BEFORE_MATCHING'
> AFTER: BEFORE_MATCHING
> 0 Root === 0 368 [[ 0 1 3 25 ]] inner
> 3 Start === 3 0 [[ 3 5 6 7 8 9 ]] #{0:control, 1:abIO, 2:memory,
> 3:rawptr:BotPTR, 4:return_address, 5:long, 6:half}
> 5 Parm === 3 [[ 368 ]] Control !jvms: testmcast::micro @ bci:-1 (line 9)
> 6 Parm === 3 [[ 368 ]] I_O !jvms: testmcast::micro @ bci:-1 (line 9)
> 7 Parm === 3 [[ 368 ]] Memory Memory: @ptr:BotPTR+bot, idx=Bot; !jvms:
> testmcast::micro @ bci:-1 (line 9)
> 8 Parm === 3 [[ 368 ]] FramePtr !jvms: testmcast::micro @ bci:-1 (line
> 9)
> 9 Parm === 3 [[ 368 ]] ReturnAdr !jvms: testmcast::micro @ bci:-1 (line
> 9)
> 25 ConL === 0 [[ 376 ]] #long:15
> 368 Return === 5 6 7 8 9 returns 398 [[ 0 ]]
> 376 VectorLongToMask === _ 25 [[ 397 ]] #vectormask<F,16> !jvms:
> VectorMask::fromLong @ bci:39 (line 243) testmcast::micro @ bci:6 (line 9)
> 397 VectorMaskCast === _ 376 [[ 398 ]] #vectormask<I,16> !jvms:
> Float512Vector$Float512Mask::toLong @ bci:35 (line 765) testmcast::micro @
> bci:11 (line 10)
> 398 VectorMaskToLong === _ 397 [[ 368 ]] #long !jvms:
> Float512Vector$Float512Mask::toLong @ bci:35 (line 765) testmcast::micro @
> bci:11 (line 10)
> [time] 17ms [res] 300000000
> TURIN>java --add-modules=jdk.incubator.vector -XX:UseAVX=2 -Xbatch
> -XX:-TieredCompilation
> -XX:CompileCommand=PrintIdealPhase,testmcast::micro,BEFORE_MATCHING -cp . testmcast
> CompileCommand: PrintIdealPhase testmcast.micro const char* PrintIdealPhase =
> 'BEFORE_MATCHING'
> AFTER: BEFORE_MATCHING
> 0 Root === 0 368 [[ 0 1 3 25 ]] inner
> 3 Start === 3 0 [[ 3 5 6 7 8 9 ]] #{0:control, 1:abIO, 2:memory,
> 3:rawptr:BotPTR, 4:return_address, 5:long, 6:half}
> 5 Parm === 3 [[ 368 ]] Control !jvms: testmcast::micro @ bci:-1 (line 9)
> 6 Parm === 3 [[ 368 ]] I_O !jvms: testmcast::micro @ bci:-1 (line 9)
> 7 Parm === 3 [[ 368 ]] Memory ...
This is caused by the different IRs generated with AVX2 and AVX3:
- With AVX3 the generated IR is
  `(VectorMaskToLong (VectorMaskCast (VectorLongToMask x)))`.
- With AVX2 the generated IR is
  `(VectorMaskToLong (VectorStoreMask (VectorMaskCast (VectorLoadMask (VectorLongToMask x)))))`.
We already support the following optimizations:
- `(VectorStoreMask (VectorMaskCast (VectorLoadMask x))) => (x)`
- `(VectorMaskToLong (VectorLongToMask x)) => (x)`
So with AVX2 the whole chain folds away: the inner
`(VectorStoreMask (VectorMaskCast (VectorLoadMask (VectorLongToMask x))))`
first collapses to `(VectorLongToMask x)` by the first rule, and then
`(VectorMaskToLong (VectorLongToMask x)) => (x)` by the second rule.

For the AVX3 shape, `(VectorMaskToLong (VectorMaskCast (VectorLongToMask x))) => (x)`
is a potential optimization. I mentioned it in the commit message, but we do not
support it yet.
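For illustration only, here is a minimal sketch of what such a rule might look like,
assuming it were added to the existing `VectorMaskToLongNode::Identity` and that we only
look through a `VectorMaskCast` that keeps the lane count unchanged; the exact guards
and placement would need review:

```c++
// Sketch only (hypothetical, not the actual patch): extend the existing
// (VectorMaskToLong (VectorLongToMask x)) => x rule so it can also look
// through an intermediate VectorMaskCast, provided the cast keeps the
// lane count unchanged, since toLong() only depends on the per-lane bits.
Node* VectorMaskToLongNode::Identity(PhaseGVN* phase) {
  Node* n = in(1);
  if (n->Opcode() == Op_VectorMaskCast &&
      n->bottom_type()->is_vect()->length() ==
          n->in(1)->bottom_type()->is_vect()->length()) {
    // Skip a type-only mask cast.
    n = n->in(1);
  }
  // Existing rule: (VectorMaskToLong (VectorLongToMask x)) => x
  if (n->Opcode() == Op_VectorLongToMask) {
    return n->in(1);
  }
  return this;
}
```

Whether `Identity` is the right hook (rather than `Ideal`), and whether additional
guards are needed, would need checking; the sketch is only meant to show the shape of
the missing rule.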
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/28313#discussion_r2618369569