On Tue, 5 Dec 2023, Robin Dapp wrote:

> > But how do we know BI<N>mode fits in QImode?
>
> I was kind of hoping that a "bit" always fits in a "byte"/unit
> but yeah, I guess we don't always know :/

But the "bit" is of constant size, so we could choose a fitting mode?

> > I think the issue is more that we try to extract an element from
> > the mask vector? How is element extraction defined for VLA vectors
> > anyway? How can we be sure to not access out-of-bounds?
>
> The mask extraction I also found odd the last time we hit this. But
> on aarch64 the same pattern is generated (although not via the
> vec_extract path) therefore I assumed that it's not fundamentally
> the wrong way.
>
> For the case here we only extract the last element of the vector
> (nunits - 1) so out of bounds is not an issue. Regarding
> out of bounds in general I was hoping we only extract when we know
> that this is ok (e.g. the first or the last element).
>
> So supposing a mask extraction is generally ok, my main issue is
> that expmed tries a BImode extract and I'm not sure this can ever
> work? Can we even move into a BImode apart from comparison results?

Well, the question is what can the hardware do?

> I can circumvent the BImode target by going the vectorizer route and
> adding:
>
>   /* Wrong check obviously. */
>   else if (can_vec_extract_var_idx_p (TYPE_MODE (vectype),
>                                       TYPE_MODE (TREE_TYPE (vectype))))
>     {
>       tree n1 = bitsize_int (nunits - 1);
>       tree scalar_res
>         = gimple_build (&stmts, CFN_VEC_EXTRACT, TREE_TYPE (vectype),
>                         vec_lhs_phi, n1);
>
>       /* Convert the extracted vector element to the scalar type. */
>       new_tree = gimple_convert (&stmts, lhs_type, scalar_res);
>     }
>
> to vectorizable_live_operation.
>
> (similar to the length and mask way). As long as we can handle
> a poly_int in the extract that works as well and extracts a QImode.

But why does RTL expansion not use vec_extract? Because of that
BImode oddity?

So yes, I guess we need to answer BImode vs. QImode. I hope Richard
has a better idea here?

Thanks,
Richard.