>> Seems a bit odd on first sight.  If all we want to do is to
>> select between two masks why do we need a large Pmode mode?

We are lowering final mask = vcond_mask_len (mask, 1s, 0s, len, bias)
into:

vid.v v1
vcmp v2
vmsltu.vx  v2, v1, len, TUMU

Here len is Pmode, so the index vector produced by vid.v needs Pmode
elements for the comparison.  That is why we only allow lowering
vcond_mask_len when a vector mode with Pmode elements exists.
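
As a rough scalar sketch of what this sequence computes (the function and
variable names are hypothetical, not taken from the patch, and the bias
operand is left out): each element gets its own index, active elements
receive the unsigned compare against len, and inactive elements keep their
old destination value. The sketch assumes the caller has initialised the
destination to a copy of the mask, which is what the ops[] array in the
quoted patch appears to do by passing operands[1] twice.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the lowered sequence; not GCC code.
   mask and dest hold one 0/1 byte per mask element, vl is the number
   of elements processed, len is the scalar length operand.  */
void lowered_mask_len(const uint8_t *mask, uint8_t *dest,
                      size_t vl, uint64_t len)
{
    for (size_t i = 0; i < vl; i++) {
        uint64_t idx = i;            /* vid.v: element i holds index i */
        if (mask[i])                 /* active element */
            dest[i] = idx < len;     /* vmsltu.vx: unsigned idx < len */
        /* Inactive element: dest[i] is left untouched, which models the
           mask-undisturbed half of TUMU.  */
    }
}
```

The index has to be as wide as len for the compare to be meaningful, which
is where the Pmode element requirement comes from.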

>> So that's basically a mask-move with length?  Can't this be done
>> differently?  If not, please describe, maybe this is already
>> the shortest way.

We are implementing: final mask = mask[i] && i < len ? 1 : 0
So yes, it is a mask move with length, but with the TUMU (tail-undisturbed,
mask-undisturbed) policy.  I believe the current approach is the optimal
way.
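
For reference, the formula above can be written as a plain loop. This is
only an illustrative sketch: the function name and the byte-per-element
mask layout are assumptions, and the bias operand is ignored.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reference model: final_mask[i] = mask[i] && i < len ? 1 : 0.
   n is the number of mask elements, len the active length.  */
void vcond_mask_len_ref(const uint8_t *mask, uint8_t *final_mask,
                        size_t n, size_t len)
{
    for (size_t i = 0; i < n; i++)
        final_mask[i] = (mask[i] && i < len) ? 1 : 0;
}
```

With mask = {1, 1, 0, 1} and len = 2 this gives {1, 1, 0, 0}: elements at
or beyond len are cleared and elements already zero in the mask stay zero.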



juzhe.zh...@rivai.ai
 
From: Robin Dapp
Date: 2024-05-14 05:14
To: pan2.li; gcc-patches
CC: rdapp.gcc; juzhe.zhong; kito.cheng; richard.guenther; Tamar.Christina; richard.sandiford
Subject: Re: [PATCH v1 2/3] RISC-V: Implement vectorizable early exit with vcond_mask_len
Hi Pan,
 
thanks for working on this.
 
In general the patch looks reasonable to me but I'd rather
have some more comments about the high-level idea.
E.g. cbranch is implemented like aarch64 by xor'ing the
bitmasks and comparing the result against zero (so we branch
based on mask equality).
 
> +;; vcond_mask_len
 
High-level description here instead please.
 
> +(define_insn_and_split "vcond_mask_len_<mode>"
> +  [(set (match_operand:VB 0 "register_operand")
 
> +    (unspec: VB [
> +     (match_operand:VB 1 "register_operand")
> +     (match_operand:VB 2 "const_1_operand")
 
I guess it works like that because operand[2] is just implicitly
used anyway but shouldn't that rather be an all_ones_operand?
 
> +   && riscv_vector::get_vector_mode (Pmode, GET_MODE_NUNITS (<MODE>mode)).exists ()"
 
Seems a bit odd on first sight.  If all we want to do is to
select between two masks why do we need a large Pmode mode?
 
> +    rtx ops[] = {operands[0], operands[1], operands[1], cmp, reg, operands[4]};
 
So that's basically a mask-move with length?  Can't this be done
differently?  If not, please describe, maybe this is already
the shortest way.
 
Regards
Robin
 
 
