On Fri, 5 Dec 2025 08:10:32 GMT, Eric Fang <[email protected]> wrote:

>> `VectorMaskCastNode` is used to cast a vector mask from one type to another 
>> type. The cast may be generated by calling the vector API `cast` or 
>> generated by the compiler. For example, some vector mask operations like 
>> `trueCount` require the input mask to have an integer type, so for floating 
>> point type masks, the compiler will cast the mask to the corresponding 
>> integer type mask automatically before doing the mask operation. This kind 
>> of cast is very common.
>> 
>> If the vector element size is not changed, the `VectorMaskCastNode` doesn't 
>> generate code; otherwise code is generated to extend or narrow the mask. 
>> This IR node is not free whether or not it generates code, because it may 
>> block some optimizations. For example:
>> 1. `(VectorStoreMask (VectorMaskCast (VectorLoadMask x)))`: the middle 
>> `VectorMaskCast` prevents the optimization `(VectorStoreMask 
>> (VectorLoadMask x)) => (x)`.
>> 2. `(VectorMaskToLong (VectorMaskCast (VectorLongToMask x)))`, which blocks 
>> the optimization `(VectorMaskToLong (VectorLongToMask x)) => (x)`.
>> 
>> In these IR patterns, the value of the input `x` is not changed, so we can 
>> safely do the optimization. But if the input value is changed, we can't 
>> eliminate the cast.
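
The claim above, that the cast leaves the value of `x` untouched, can be illustrated with a small value-level model. This is only a sketch and not HotSpot code: `MaskValueModel`, `Mask`, and the three helper methods are hypothetical names, and the model assumes `x` is a boolean vector holding only 0 or 1 in each lane.

```java
import java.util.Arrays;

// Toy value-level model (hypothetical, not HotSpot code): a mask is a
// lane-wise boolean array tagged with an element size in bytes.
final class MaskValueModel {
    record Mask(boolean[] lanes, int elemBytes) {}

    // Models VectorLoadMask: a boolean vector (one 0/1 byte per lane) becomes a mask.
    static Mask loadMask(byte[] bools, int elemBytes) {
        boolean[] lanes = new boolean[bools.length];
        for (int i = 0; i < bools.length; i++) {
            lanes[i] = bools[i] != 0;
        }
        return new Mask(lanes, elemBytes);
    }

    // Models VectorMaskCast: only the element size changes, lane values are kept.
    static Mask cast(Mask m, int newElemBytes) {
        return new Mask(m.lanes().clone(), newElemBytes);
    }

    // Models VectorStoreMask: a mask becomes a boolean vector again.
    static byte[] storeMask(Mask m) {
        byte[] out = new byte[m.lanes().length];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) (m.lanes()[i] ? 1 : 0);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] x = {1, 0, 0, 1};
        // Load as a 4-byte-element mask, cast to 8-byte elements, store back.
        byte[] viaCast = storeMask(cast(loadMask(x, 4), 8));
        System.out.println(Arrays.equals(x, viaCast));
    }
}
```

In this model the round trip through the cast returns the original lanes, which is why skipping the cast is safe for these patterns.
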
>> 
>> The general idea of this PR is to introduce an `uncast_mask` helper function, 
>> which can be used to uncast a chain of `VectorMaskCastNode`, like the 
>> existing `Node::uncast(bool)` function. The function returns the first node 
>> that is not a `VectorMaskCastNode`.
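
The helper described above might be sketched on a toy IR as follows. The real `uncast_mask` lives in C2's C++ sources and its exact signature is not shown in this mail, so everything here (`UncastMaskSketch`, `Node`, `uncastMask`, `identityOfStoreMask`) is a hypothetical Java model in which each node has an opcode name and at most one input.

```java
// Toy IR model (hypothetical, not HotSpot code).
final class UncastMaskSketch {
    record Node(String op, Node in) {
        boolean isMaskCast() {
            return op.equals("VectorMaskCast");
        }
    }

    // Skips a chain of consecutive VectorMaskCast nodes and returns the
    // first node that is not a VectorMaskCast.
    static Node uncastMask(Node n) {
        while (n.isMaskCast()) {
            n = n.in();
        }
        return n;
    }

    // Models (VectorStoreMask (VectorMaskCast ... (VectorLoadMask x))) => x.
    // Returns x when the pattern matches, otherwise returns n unchanged.
    static Node identityOfStoreMask(Node n) {
        if (!n.op().equals("VectorStoreMask")) {
            return n;
        }
        Node in = uncastMask(n.in());
        return in.op().equals("VectorLoadMask") ? in.in() : n;
    }

    public static void main(String[] args) {
        Node x = new Node("x", null);
        Node load = new Node("VectorLoadMask", x);
        Node casts = new Node("VectorMaskCast", new Node("VectorMaskCast", load));
        Node store = new Node("VectorStoreMask", casts);
        // Reference comparison: did the match hand back the original input node?
        System.out.println(identityOfStoreMask(store) == x);
    }
}
```

The pattern matcher calls `uncastMask` once on its input, so it works the same for zero, one, or many intermediate casts.
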
>> 
>> The intended use case is when the IR pattern to be optimized may contain one 
>> or more consecutive `VectorMaskCastNode` and this does not affect the 
>> correctness of the optimization. Then this function can be called to 
>> eliminate the `VectorMaskCastNode` chain.
>> 
>> Current optimizations related to `VectorMaskCastNode` include:
>> 1. `(VectorMaskCast (VectorMaskCast x)) => (x)`, see JDK-8356760.
>> 2. `(XorV (VectorMaskCast (VectorMaskCmp src1 src2 cond)) (Replicate -1)) => 
>> (VectorMaskCast (VectorMaskCmp src1 src2 ncond))`, see JDK-8354242.
>> 
>> This PR does the following optimizations:
>> 1. Extends the optimization pattern `(VectorMaskCast (VectorMaskCast x)) => 
>> (x)` to `(VectorMaskCast (VectorMaskCast ... (VectorMaskCast x))) => (x)`. 
>> This is correct as long as the types of the head and tail 
>> `VectorMaskCastNode` are consistent.
>> 2. Supports a new optimization pattern `(VectorStoreMask (VectorMaskCast ... 
>> (VectorLoadMask x))) => (x)`. Since the value before and after the pattern 
>> is a boolean vect...
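
The type condition in item 1 can be sketched on a toy IR. This assumes that "consistent" means the result type of the outermost cast equals the type of `x`; the mail does not spell that out, and `MaskCastChainSketch`, `Node`, and `collapse` are hypothetical names rather than C2 code.

```java
// Toy IR model (hypothetical, not HotSpot code).
final class MaskCastChainSketch {
    // A node with an opcode, a mask element type (e.g. "int", "float"), and one input.
    record Node(String op, String type, Node in) {}

    // Collapses (VectorMaskCast ... (VectorMaskCast x)) to x, but only when
    // the outermost cast produces the type that x already has.
    static Node collapse(Node n) {
        if (!n.op().equals("VectorMaskCast")) {
            return n;
        }
        Node src = n;
        while (src.op().equals("VectorMaskCast")) {
            src = src.in();
        }
        return src.type().equals(n.type()) ? src : n;
    }

    public static void main(String[] args) {
        Node x = new Node("x", "int", null);
        Node toFloat = new Node("VectorMaskCast", "float", x);
        Node backToInt = new Node("VectorMaskCast", "int", toFloat);
        System.out.println(collapse(backToInt) == x);        // chain ends where it started
        System.out.println(collapse(toFloat) == toFloat);    // type differs, cast is kept
    }
}
```

A chain that ends in a different type than `x` is left alone here, since the cast still changes the node's type.
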
>
> Eric Fang has updated the pull request with a new target base due to a merge 
> or a rebase. The incremental webrev excludes the unrelated changes brought in 
> by the merge/rebase. The pull request contains five additional commits since 
> the last revision:
> 
>  - Refine the test code and comments
>  - Merge branch 'master' into JDK-8370863-mask-cast-opt
>  - Don't read and write the same memory in the JMH benchmarks
>  - Merge branch 'master' into JDK-8370863-mask-cast-opt
>  - 8370863: VectorAPI: Optimize the VectorMaskCast chain in specific patterns

Thanks for your review! @galderz

-------------

PR Review: https://git.openjdk.org/jdk/pull/28313#pullrequestreview-3537647873
