on 2021/5/27 8:55 PM, Richard Sandiford wrote:
> Sorry for the slow response.
>
> "Kewen.Lin" <li...@linux.ibm.com> writes:
>> diff --git a/gcc/vec-perm-indices.c b/gcc/vec-perm-indices.c
>> index ede590dc5c9..57dd11d723c 100644
>> --- a/gcc/vec-perm-indices.c
>> +++ b/gcc/vec-perm-indices.c
>> @@ -101,6 +101,70 @@ vec_perm_indices::new_expanded_vector (const vec_perm_indices &orig,
>>    m_encoding.finalize ();
>>  }
>>
>> +/* Check whether we can switch to a new permutation vector that
>> +   selects the same input elements as ORIG, but with each element
>> +   built up from FACTOR pieces. Return true if yes, otherwise
>> +   return false. Every FACTOR permutation indexes should be
>> +   continuous separately and the first one of each batch should
>> +   be able to exactly modulo FACTOR. For example, if ORIG is
>> +   { 2, 3, 4, 5, 0, 1, 6, 7 } and FACTOR is 2, the new permutation
>> +   is { 1, 2, 0, 3 }. */
>> +
>> +bool
>> +vec_perm_indices::new_shrunk_vector (const vec_perm_indices &orig,
>> +                                     unsigned int factor)
>> +{
>> +  gcc_assert (factor > 0);
>> +
>> +  if (maybe_lt (orig.m_nelts_per_input, factor))
>> +    return false;
>> +
>> +  poly_uint64 nelts;
>> +  /* Invalid if vector units number isn't multiple of factor. */
>> +  if (!multiple_p (orig.m_nelts_per_input, factor, &nelts))
>> +    return false;
>> +
>> +  /* Only handle the case that npatterns is multiple of factor.
>> +     FIXME: Try to see whether we can reshape it by factor npatterns. */
>> +  if (orig.m_encoding.npatterns () % factor != 0)
>> +    return false;
>> +
>> +  unsigned int encoded_nelts = orig.m_encoding.encoded_nelts ();
>> +  auto_vec<element_type> encodings (encoded_nelts);
>
> auto_vec<element_type, 32> would avoid memory allocations in the
> same cases that m_encoding can. “encoding” might be better than
> “encodings” since there's only really one encoding here.
>
>> +  /* Separate all encoded elements into batches by size factor,
>> +     then ensure the first element of each batch is multiple of
>> +     factor and all elements in each batch is consecutive from
>> +     the first one. */
>> +  for (unsigned int i = 0; i < encoded_nelts; i += factor)
>> +    {
>> +      element_type first = orig.m_encoding[i];
>> +      element_type new_index;
>> +      if (!multiple_p (first, factor, &new_index))
>> +        return false;
>> +      for (unsigned int j = 1; j < factor; ++j)
>> +        {
>> +          if (maybe_ne (first + j, orig.m_encoding[i + j]))
>> +            return false;
>> +        }
>
> Formatting nit: unnecessary braces around if.
>
>> +      encodings.quick_push (new_index);
>> +    }
>> +
>> +  m_ninputs = orig.m_ninputs;
>> +  m_nelts_per_input = nelts;
>> +  poly_uint64 full_nelts = exact_div (orig.m_encoding.full_nelts (), factor);
>> +  unsigned int npatterns = orig.m_encoding.npatterns () / factor;
>> +
>> +  m_encoding.new_vector (full_nelts, npatterns,
>> +                         orig.m_encoding.nelts_per_pattern ());
>> +
>> +  for (unsigned int i = 0; i < encodings.length (); i++)
>> +    m_encoding.quick_push (encodings[i]);
>
> I think this can be:
>
>   m_encoding.splice (encodings);
>
> OK with those changes, thanks. Thanks also for doing it in a
> variable-length-friendly way.
>

Thanks for the comments, Richard! The patch was updated accordingly,
re-tested, and committed as r12-1103.

BR,
Kewen
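
For anyone reading this thread in the archive, the batch check in the
quoted patch can be sketched outside GCC roughly as follows. This is only
an illustration under simplifying assumptions: it handles fixed-length
vectors, works on the full index list rather than the pattern encoding,
and uses std::vector and plain unsigned arithmetic in place of auto_vec
and the poly_int helpers (multiple_p, maybe_ne). The name shrink_indices
is hypothetical and does not exist in GCC.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical standalone helper (not part of GCC): try to re-express a
// fixed-length permutation ORIG in units that are FACTOR times wider.
// Returns the shrunk index vector, or std::nullopt if ORIG cannot be
// shrunk by FACTOR.
std::optional<std::vector<unsigned>>
shrink_indices (const std::vector<unsigned> &orig, unsigned factor)
{
  assert (factor > 0);

  // The element count must split evenly into batches of FACTOR.
  if (orig.size () % factor != 0)
    return std::nullopt;

  std::vector<unsigned> shrunk;
  shrunk.reserve (orig.size () / factor);
  for (std::size_t i = 0; i < orig.size (); i += factor)
    {
      unsigned first = orig[i];
      // The first index of each batch must be a multiple of FACTOR ...
      if (first % factor != 0)
        return std::nullopt;
      // ... and the rest of the batch must follow it consecutively.
      for (unsigned j = 1; j < factor; ++j)
        if (orig[i + j] != first + j)
          return std::nullopt;
      shrunk.push_back (first / factor);
    }
  return shrunk;
}
```

With the example from the patch comment, shrink_indices ({ 2, 3, 4, 5,
0, 1, 6, 7 }, 2) gives { 1, 2, 0, 3 }, while { 1, 2, 3, 4 } with FACTOR 2
is rejected because its first batch starts at an odd index.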