>> Seems a bit odd on first sight. If all we want to do is to
>> select between two masks why do we need a large Pmode mode?

We are lowering

  final mask = vcond_mask_len (mask, 1s, 0s, len, bias)

into:

  vid.v v1
  vcmp v2
  vmsltu.vx v2, v1, len, TUMU

Here len is Pmode, so we only allow lowering vcond_mask_len when a
vector mode with Pmode elements exists.

>> So that's basically a mask-move with length? Can't this be done
>> differently? If not, please describe, maybe this is already
>> the shortest way.

We are implementing:

  final mask = mask[i] && i < len ? 1 : 0

It is a mask move with length, but with TUMU. I believe the current
approach is the optimal way.


juzhe.zh...@rivai.ai

From: Robin Dapp
Date: 2024-05-14 05:14
To: pan2.li; gcc-patches
CC: rdapp.gcc; juzhe.zhong; kito.cheng; richard.guenther; Tamar.Christina; richard.sandiford
Subject: Re: [PATCH v1 2/3] RISC-V: Implement vectorizable early exit with vcond_mask_len

Hi Pan,

thanks for working on this.

In general the patch looks reasonable to me but I'd rather
have some more comments about the high-level idea.
E.g. cbranch is implemented like aarch64 by xor'ing the
bitmasks and comparing the result against zero (so we branch
based on mask equality).

> +;; vcond_mask_len

High-level description here instead please.

> +(define_insn_and_split "vcond_mask_len_<mode>"
> +  [(set (match_operand:VB 0 "register_operand")
> +    (unspec: VB [
> +      (match_operand:VB 1 "register_operand")
> +      (match_operand:VB 2 "const_1_operand")

I guess it works like that because operand[2] is just implicitly
used anyway but shouldn't that rather be an all_ones_operand?

> +  && riscv_vector::get_vector_mode (Pmode, GET_MODE_NUNITS
> (<MODE>mode)).exists ()"

Seems a bit odd on first sight. If all we want to do is to
select between two masks why do we need a large Pmode mode?

> +  rtx ops[] = {operands[0], operands[1], operands[1], cmp, reg,
> operands[4]};

So that's basically a mask-move with length? Can't this be done
differently? If not, please describe, maybe this is already
the shortest way.

Regards
 Robin
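
For reference, the selection described in the reply above
(final mask = mask[i] && i < len ? 1 : 0) can be modelled in scalar C.
This is only a sketch, not the patch's code. The function names are
made up for illustration, the bias operand is assumed to be 0 and is
left out, and the "lowered" variant assumes that the merge operand of
the masked compare is the input mask itself (which is how the ops[]
array in the quoted patch reads), so that masked-off elements keep a 0.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Semantics as stated in the thread: final[i] = mask[i] && i < len.  */
static void
mask_len_reference (const bool *mask, size_t n, uint64_t len, bool *out)
{
  for (size_t i = 0; i < n; i++)
    out[i] = (mask[i] && i < len) ? 1 : 0;
}

/* Scalar model of the vid.v + masked vmsltu.vx sequence.
   Assumptions (not confirmed by the thread): bias is 0, all n elements
   are within vl, and the merge value for masked-off elements is the
   input mask itself.  */
static void
mask_len_lowered (const bool *mask, size_t n, uint64_t len, bool *out)
{
  for (size_t i = 0; i < n; i++)
    {
      uint64_t idx = i;        /* vid.v: element i holds the index i.  */
      if (mask[i])
        out[i] = idx < len;    /* vmsltu.vx on active elements.  */
      else
        out[i] = mask[i];      /* Mask-undisturbed: keep the merge value (0).  */
    }
}
```

The index vector has to be wide enough to compare against len, which
is the reason given in the reply for requiring a Pmode vector mode.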