on 2021/5/27 8:55 PM, Richard Sandiford wrote:
> Sorry for the slow response.
> 
> "Kewen.Lin" <li...@linux.ibm.com> writes:
>> diff --git a/gcc/vec-perm-indices.c b/gcc/vec-perm-indices.c
>> index ede590dc5c9..57dd11d723c 100644
>> --- a/gcc/vec-perm-indices.c
>> +++ b/gcc/vec-perm-indices.c
>> @@ -101,6 +101,70 @@ vec_perm_indices::new_expanded_vector (const vec_perm_indices &orig,
>>    m_encoding.finalize ();
>>  }
>>  
>> +/* Check whether we can switch to a new permutation vector that
>> +   selects the same input elements as ORIG, but with each element
>> +   built up from FACTOR pieces.  Return true if yes, otherwise
>> +   return false.  Every FACTOR permutation indexes should be
>> +   continuous separately and the first one of each batch should
>> +   be able to exactly modulo FACTOR.  For example, if ORIG is
>> +   { 2, 3, 4, 5, 0, 1, 6, 7 } and FACTOR is 2, the new permutation
>> +   is { 1, 2, 0, 3 }.  */
>> +
>> +bool
>> +vec_perm_indices::new_shrunk_vector (const vec_perm_indices &orig,
>> +                                 unsigned int factor)
>> +{
>> +  gcc_assert (factor > 0);
>> +
>> +  if (maybe_lt (orig.m_nelts_per_input, factor))
>> +    return false;
>> +
>> +  poly_uint64 nelts;
>> +  /* Invalid if vector units number isn't multiple of factor.  */
>> +  if (!multiple_p (orig.m_nelts_per_input, factor, &nelts))
>> +    return false;
>> +
>> +  /* Only handle the case that npatterns is multiple of factor.
>> +     FIXME: Try to see whether we can reshape it by factor npatterns.  */
>> +  if (orig.m_encoding.npatterns () % factor != 0)
>> +    return false;
>> +
>> +  unsigned int encoded_nelts = orig.m_encoding.encoded_nelts ();
>> +  auto_vec<element_type> encodings (encoded_nelts);
> 
> auto_vec<element_type, 32> would avoid memory allocations in the
> same cases that m_encoding can.  “encoding” might be better than
> “encodings” since there's only really one encoding here.
> 
>> +  /* Separate all encoded elements into batches by size factor,
>> +     then ensure the first element of each batch is multiple of
>> +     factor and all elements in each batch is consecutive from
>> +     the first one.  */
>> +  for (unsigned int i = 0; i < encoded_nelts; i += factor)
>> +    {
>> +      element_type first = orig.m_encoding[i];
>> +      element_type new_index;
>> +      if (!multiple_p (first, factor, &new_index))
>> +    return false;
>> +      for (unsigned int j = 1; j < factor; ++j)
>> +    {
>> +      if (maybe_ne (first + j, orig.m_encoding[i + j]))
>> +        return false;
>> +    }
> 
> Formatting nit: unnecessary braces around if.
> 
>> +      encodings.quick_push (new_index);
>> +    }
>> +
>> +  m_ninputs = orig.m_ninputs;
>> +  m_nelts_per_input = nelts;
>> +  poly_uint64 full_nelts = exact_div (orig.m_encoding.full_nelts (), factor);
>> +  unsigned int npatterns = orig.m_encoding.npatterns () / factor;
>> +
>> +  m_encoding.new_vector (full_nelts, npatterns,
>> +                     orig.m_encoding.nelts_per_pattern ());
>> +
>> +  for (unsigned int i = 0; i < encodings.length (); i++)
>> +    m_encoding.quick_push (encodings[i]);
> 
> I think this can be:
> 
>    m_encoding.splice (encodings);
> 
> OK with those changes, thanks.  Thanks also for doing it in a
> variable-length-friendly way.
> 


Thanks for the comments, Richard!  The patch was updated accordingly,
re-tested and committed in r12-1103.

BR,
Kewen
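
For readers following the thread, the batch check described in the
patch comment can be sketched outside GCC on a plain, fully expanded
index vector.  This is only an illustration under stated assumptions:
shrink_permutation is a hypothetical helper, it uses std::vector and
plain unsigned indices instead of the encoded poly_int representation
that vec_perm_indices works on, and it is not the committed code.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper, not GCC code: try to shrink a fully expanded
// permutation ORIG by FACTOR.  Returns the shrunk permutation, or
// std::nullopt if the indices cannot be grouped into FACTOR-sized
// batches.
std::optional<std::vector<unsigned>>
shrink_permutation (const std::vector<unsigned> &orig, unsigned factor)
{
  assert (factor > 0);

  // The number of elements must be a multiple of FACTOR.
  if (orig.size () % factor != 0)
    return std::nullopt;

  std::vector<unsigned> shrunk;
  shrunk.reserve (orig.size () / factor);
  for (std::size_t i = 0; i < orig.size (); i += factor)
    {
      unsigned first = orig[i];
      // The first index of each batch must be a multiple of FACTOR.
      if (first % factor != 0)
        return std::nullopt;
      // The remaining indices must follow the first one consecutively.
      for (unsigned j = 1; j < factor; ++j)
        if (orig[i + j] != first + j)
          return std::nullopt;
      shrunk.push_back (first / factor);
    }
  return shrunk;
}
```

With the example from the patch comment, { 2, 3, 4, 5, 0, 1, 6, 7 }
and a factor of 2 give { 1, 2, 0, 3 }, while { 0, 2, 4, 5 } is
rejected because 2 does not follow 0 consecutively.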
