>> Seems a bit odd on first sight. If all we want to do is to
>> select between two masks why do we need a large Pmode mode?
We lower final mask = vcond_mask_len (mask, 1s, 0s, len, bias)
into:
vid.v v1
vcmp v2
vmsltu.vx v2, v1, len, TUMU
Here len is Pmode, so we only allow lowering vcond_mask_len when a vector
mode with Pmode elements exists.
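
To make the Pmode requirement concrete, here is a rough scalar model in C of what the sequence above computes. It is only a sketch: the function name is made up for illustration, uintptr_t stands in for Pmode, and it assumes the compare is masked by the input mask and merges into it (mask-undisturbed), so inactive lanes keep their old value of 0. The point is that the lane index from vid.v is compared against len, so the index vector needs elements as wide as len.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of the lowered sequence on n mask elements.
   Assumption: the compare is masked by `mask` and merges into `mask`
   (mask-undisturbed), so inactive lanes keep their old value, which is 0. */
static void lowered_vcond_mask_len(const uint8_t *mask, uint8_t *out,
                                   size_t n, uintptr_t len)
{
    for (size_t i = 0; i < n; i++) {
        uintptr_t idx = (uintptr_t)i;   /* vid.v: lane index, as wide as len */
        if (mask[i])
            out[i] = idx < len;         /* vmsltu.vx on active lanes */
        else
            out[i] = mask[i];           /* inactive lane: keep merge value (0) */
    }
}
```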
>> So that's basically a mask-move with length? Can't this be done
>> differently? If not, please describe, maybe this is already
>> the shortest way.
We are implementing: final mask[i] = mask[i] && i < len ? 1 : 0
It is a mask move with length, but with TUMU. I believe the current approach
is the optimal way.
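
For reference, the intended per-element semantics can be written as a plain C loop. This is a minimal sketch, not code from the patch; the function name and the bool-array representation of the mask are assumptions for illustration only.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reference semantics:
   final_mask[i] = (mask[i] && i < len) ? 1 : 0 */
static void vcond_mask_len_ref(const bool *mask, bool *final_mask,
                               size_t n, size_t len)
{
    for (size_t i = 0; i < n; i++)
        final_mask[i] = mask[i] && i < len;
}
```

For example, mask = {1, 1, 0, 1} with len = 2 gives final mask = {1, 1, 0, 0}: element 2 is already clear in mask, and element 3 is cleared because it lies at or beyond len.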
juzhe.zh...@rivai.ai
From: Robin Dapp
Date: 2024-05-14 05:14
To: pan2.li; gcc-patches
CC: rdapp.gcc; juzhe.zhong; kito.cheng; richard.guenther; Tamar.Christina;
richard.sandiford
Subject: Re: [PATCH v1 2/3] RISC-V: Implement vectorizable early exit with
vcond_mask_len
Hi Pan,
thanks for working on this.
In general the patch looks reasonable to me but I'd rather
have some more comments about the high-level idea.
E.g. cbranch is implemented like aarch64 by xor'ing the
bitmasks and comparing the result against zero (so we branch
based on mask equality).
> +;; vcond_mask_len
High-level description here instead please.
> +(define_insn_and_split "vcond_mask_len_<mode>"
> + [(set (match_operand:VB 0 "register_operand")
> +(unspec: VB [
> + (match_operand:VB 1 "register_operand")
> + (match_operand:VB 2 "const_1_operand")
I guess it works like that because operand[2] is just implicitly
used anyway but shouldn't that rather be an all_ones_operand?
> + && riscv_vector::get_vector_mode (Pmode, GET_MODE_NUNITS
> (<MODE>mode)).exists ()"
Seems a bit odd on first sight. If all we want to do is to
select between two masks why do we need a large Pmode mode?
> +rtx ops[] = {operands[0], operands[1], operands[1], cmp, reg,
> operands[4]};
So that's basically a mask-move with length? Can't this be done
differently? If not, please describe, maybe this is already
the shortest way.
Regards
Robin